To demonstrate to computer programmers that the programming language
Ada provides superior facilities for use in artificial intelligence
applications, the three papers included in this report investigate the
capabilities that exist within Ada for "pattern-directed" programming.
The first paper (Larry H. Reeker, Tulane University) is designed to
serve as an introduction to pattern-directed programming and to the
significance of the two papers that follow. It includes discussions of
artificial intelligence programming and the facilities provided by the
Ada language, pattern-directed computation, pattern matching, and
parsing. The second paper (John Kreuter, Tulane University) describes
a project which was part of an overall effort to add useful artificial
intelligence tools to Ada through use of pattern-directed string
processing of the sort available in the language Post-X (Bailes and
Reeker, 1980). The third paper (Kenneth Wauchope, Tulane University)
presents a pattern-directed list processing facility for the Ada
programming language. Pattern lists for matching against source lists
are constructed from a set of SNOBOL4-derived primitives which have
been extended to be applicable to arbitrarily complex LISP-like data
structures. A list of references completes the document. (JB)
PREFACE

During the summer of 1984, under the auspices of the Summer Faculty
Research Program and the Graduate Student Summer Support Program of the
Air Force Office of Scientific Research, administered by the Southeastern Center
for Electrical Engineering Education, work was undertaken at the Air Force
Human Resources Laboratory, Lowry AFB, Colorado, concerning use of the
programming language Ada* for artificial intelligence programming. Two projects
were undertaken, both of which relate to "pattern-directed" programming, by
John Kreuter and Kenneth Wauchope, under my direction. I have edited the
final reports on these projects and provided them with an introduction, so as to
make them intelligible to a larger audience than might otherwise have been the
case.

Mr. Wauchope, Mr. Kreuter and I would all like to acknowledge the support
of the Air Force Systems Command, Air Force Office of Scientific Research, and
the Air Force Human Resources Laboratory (Training Systems Division). At
AFHRL, Maj. Hugh Byras deserves special thanks as the person with whom we
interacted most closely, and Dr. Roger Pennell, as the person who interfaced
with the AFOSR/SCEEE summer program. Col. Crow, Dr. Yasutake and Maj.
Baxter were all very cooperative and helpful administratively, as were Mr.
Marshall and Bolz in the computing area; and a number of AFHRL staff
members made the visit pleasurable and productive.

L.H.R.
INTRODUCTION: AI IN ADA?

Larry H. Reeker*

Tulane University 


If the programming language Ada is to be widely used in artificial intelligence applications, it
will be necessary to demonstrate to programmers that it can provide superior facilities for use in
that domain. One means of doing this is to provide facilities for "pattern-directed" programming
within Ada. This first paper is designed to serve as an introduction to pattern-directed
programming and to the significance of the two papers that follow. It includes discussions of
artificial intelligence programming and the facilities provided by the Ada language, pattern-directed
computation, pattern matching and parsing. The other two papers deal with the use of Ada for
pattern-directed programming. One paper deals with efficient implementation of pattern matching
(within Ada), important because pattern matching tends to be inefficient, leading to problems with
excessive processing time. Another paper treats extensions of pattern-direction from strings to more
general data structures of the sort used in artificial intelligence.

1. THE PROBLEM

The question implicit in the title of this paper might be "Can artificial intelligence
be done in Ada". It might also be "Will artificial intelligence be done in Ada", which is
more to the point, since anything can be done in Ada. The purpose of the research
reported in the three papers comprising this report is to explore methods of doing
artificial intelligence within Ada, using pattern-directed programming. The goal is to
show that Ada, appropriately used, can facilitate the programming of artificial
intelligence applications.

Ada is the new standard programming language developed for the United States
Department of Defense (DoD). It is intended that Ada be used for mission-oriented
applications programs within DoD, replacing a variety of languages that have been used
previously. Concepts in Ada are based to a large extent on the languages SIMULA and
Pascal. Most artificial intelligence (AI) and computational linguistics (CL) research, on
the other hand, is done in the language LISP, with some done in Prolog, SNOBOL4, and
a variety of other languages. Even within DoD, such research continues to be done in
these languages, rather than in Ada. But artificial intelligence research is ultimately
applications-oriented, and what we consider to be AI today will be an important part of
applications of the future, at all levels from office automation and record keeping to
command and control and maintenance-aiding.

If Ada is to be the common DoD language and if various "intelligent" applications
are to be interfaced to programs written in Ada, then it would be convenient to be able
to program AI and CL applications in Ada. Brian Dallman [1984] has expressed the
problem as follows:

Since Ada recently became the DoD standard computer language, ideally it should be
used for all programming applications within DoD. However, there are some applications
for which Ada is not currently practical. One of these areas is artificial intelligence. In
DoD, the majority of programming for AI applications is done in LISP. Consequently, if
LISP remains the primary AI language, then Ada's usage and acceptance in a critical
new area of software engineering will be severely limited and DoD's effort to establish a
common high order language will be hampered.

*Current address: Navy Center for Applied Research in Artificial Intelligence, Code 7610, Naval Research
Laboratory, Washington, D.C. 20375.

With these considerations in mind, Dallman suggests the following objective:

To develop an extension of the Ada language which will provide the capabilities for AI 
programming applications. This extension can involve possibly only an Ada package or 
collection of packages. 

Two research efforts undertaken in the summer of 1984 toward this objective are
reported in the papers that follow in this report. The first, by John Kreuter, looks at
methods for implementing efficient pattern-directed computation in Ada. The second,
by Kenneth Wauchope, deals with the development of LISP-like list processing, and of a
language for pattern-directed computation on list structures.

The objective here is not merely to mimic LISP in Ada, but to improve upon LISP,
which has some well-known defects, despite its popularity. We have chosen the
pattern-directed paradigm of programming for this purpose. There is today a body of
opinion, shared by this author, that says that pattern-directed facilities provide the most
effective means for creating complex programs for non-numerical applications. That this
opinion is not universally shared could have to do with different individuals'
programming styles; but we quote here an opinion that supports our view in this matter
[Warren, Pereira and Pereira, 1977]:

Pattern matching should not be considered an "exotic extra" when designing a
programming language. It is the preferable method for specifying operations on structured data,
both from the user's and the implementor's point of view. This is especially so where
more than one record type is allowed.

The remainder of this paper will concern itself with some of the background issues
that will provide a rationale for the work being done and help the reader to understand
the Kreuter and Wauchope papers. We shall first look at the programming requirements
of artificial intelligence and the facilities provided by Ada at present.

1.1. LANGUAGES FOR ARTIFICIAL INTELLIGENCE PROGRAMMING

Although one could write artificial intelligence programs in any language, certain
languages lend themselves to the task. This is largely because they have the data
structures that are most natural for the complex information processing necessary in AI built
into the language, and because they also feature the operations that are needed to
handily manipulate those data structures.

The linked list (henceforth, "list") is pervasive in artificial intelligence
programming. In early languages, lists were always represented by arrays, and they can still be
so represented when it is necessary to use one of the common arithmetic languages, such
as FORTRAN*. In other languages, such as Pascal, PL/I and Ada (see §1.2, below), lists
are implemented by the provision of a "pointer" datatype. But LISP has long been the
most popular AI language because it focuses on lists, providing the needed
list-constructing functions and means of selecting the items of a list.
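The pointer-based representation of lists described above can be sketched in a few lines. The following is only an illustration in Python, not code from the report: the names `Cell`, `cons`, `from_items` and `to_items` are assumptions chosen to echo the LISP idea of constructing a list from a first item and a remaining list.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Cell:
    """One list cell: an item plus a reference ("pointer") to the rest."""
    head: Any
    tail: Optional["Cell"] = None


def cons(item, rest=None):
    """Construct a new list by placing item in front of rest."""
    return Cell(item, rest)


def from_items(items):
    """Build a linked list from any Python sequence."""
    result = None
    for item in reversed(list(items)):
        result = cons(item, result)
    return result


def to_items(cell):
    """Walk the cells and collect the items in order."""
    out = []
    while cell is not None:
        out.append(cell.head)
        cell = cell.tail
    return out


lst = cons("a", cons("b", cons("c")))
```

Here `lst.head` selects the first item and `lst.tail` the remainder of the list, which is the kind of selection a record-and-pointer representation also supports, only less directly.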



*No references are given for the well-known programming languages, as manuals can easily be obtained at
bookstores and libraries.
Another datatype that is common in artificial intelligence (particularly in
computational linguistics) is the character string (henceforth, "string"). A string can be
represented as an array of characters (as, for instance, in APL) or as a list of characters.
But programmers tend to think about strings in a different way. They tend to think
about patterns of characters, and patterns have therefore tended to be emphasized in
string processing languages, such as SNOBOL4 (see §2.1, below).

Another data structure that can be represented as a type of list or as a
parenthesized string, but that is often conceptualized quite separately from these, is the
tree, commonly used in games, taxonomies, structural descriptions of strings (parse
trees), and the like. Like the string, it is often processed in terms of looking for a
pattern. This is particularly apparent in transformational grammars, some examples of
which will be seen in Wauchope's paper in this report. Pattern-directed manipulation of
trees is not natural in most extant languages, and Wauchope's system is aimed at
making it more natural in an extension of Ada.

There have been attempts to generalize structures like trees and lists to directed or
undirected linear graphs, which may contain cycles (trees are directed acyclic graphs
with a single origin or "root node"). These may yet turn out to be useful, and it is
suggested that pattern-directed processing will also be useful in processing these
generalizations. It is not clear, however, how to treat graphs that are not trees directly, rather
than in terms of trees, for pattern matching purposes.

There are factors other than data structures that characterize the languages and
environments in which productive AI work is taking place. Richardson [1983] cites the
following:

1. Focus on symbol manipulation and list processing

2. Support of representations which change dynamically

3. Support of flexible control by pattern matching rather than procedure calls

4. Supportive programming environment, including

a. An interactive (interpreted) language

b. A good editor (program construct oriented, not text oriented)

c. Debugging facilities (traces/breaks)

d. Standard systems input/output functions

Of these factors, the first two basically have to do with the processing of lists and
strings, which are, by their nature, dynamic entities (their shapes and sizes change
throughout processing). Languages which try to do string and list processing with less
dynamic entities (e.g. fixed arrays) can quickly be eliminated from contention, unless
these entities can be made to appear dynamic to the programmer. Pattern matching we
will address below. We will not directly address the programming environment, except
to comment that the types of facilities that we are seeking to provide in Ada can be
abstracted from the language and placed in a "languageless" programming environment
(which is not really languageless, since there is always need for a representation, but is
not textually oriented, either). In this case, the underlying programming language is
almost irrelevant: it could be LISP, or Ada, or anything else (see [Reeker and Bailes,
in preparation]).

Language extensibility, discussed in §1.3, has also been important in AI and CL,
since the fields (and therefore their language support needs) have been evolving
rather quickly. There is no reason to believe that the pace of change is going to slow
down in the near future, so one might conclude that extensibility is another need in an
AI language.

1.2. PROGRAMMING FACILITIES PROVIDED BY ADA 

The Ada* language (see [Honeywell, 1983, 1984] and a variety of texts presently on
the market and under preparation) was designed for the U.S. Department of Defense, in
an attempt to promote language standardization in applications programs and program
reliability and maintenance, while maintaining program efficiency. It has a variety of
features designed to make it useful in general applications. We shall briefly describe
only those that are relevant to the discussion in this report and different from other
commonly used languages, such as Pascal.

An Ada program may contain various types of program units, each of which is a
subprogram, a package, a task, or a generic unit. Each unit contains a
specification and a body. The specification contains information that must be visible
to other units, while the body contains implementation details. Units may be compiled
separately.

Subprograms consist of the usual procedures and functions, and will not be
discussed further. Tasks are units that may be invoked and executed in parallel with other
tasks. Generally, in the absence of parallel computation facilities, tasks are executed in
an interleaved fashion, but multiprocessing is clearly possible, and it is envisioned that
parallel execution will be used commonly as the hardware becomes available. As an
example of tasks, consider a multi-player game. Each player could be considered as a
task, or as instances of the same task with different parameters (say, different hands in a
card game, passed to the instances on invocation).
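The card-game example can be made concrete with a loose analogy. The sketch below uses Python threads rather than Ada tasks, so it illustrates only the idea of several instances of one unit running with different parameters; the names `player`, `hands` and `results`, and the scoring rule, are assumptions made for the illustration.

```python
import threading


def player(name, hand, results, lock):
    """One instance of the "player" unit; its hand is passed on invocation."""
    score = sum(hand)            # stand-in for whatever a player computes
    with lock:                   # protect the shared results table
        results[name] = score


hands = {"north": [10, 4, 7], "south": [2, 9, 9], "east": [5, 5, 5]}
results = {}
lock = threading.Lock()

# Same unit, different parameters: one thread per player.
threads = [
    threading.Thread(target=player, args=(name, hand, results, lock))
    for name, hand in hands.items()
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

As in the text, whether the instances actually run in parallel or are interleaved depends on the underlying system, and the program is written the same way in either case.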

Packages are usually used to define new datatypes and the operations on them.
Portions of the package can be declared private, so that details not necessary to the
user are "hidden" from the user, thus adding to the apparent (though not necessarily
the underlying) simplicity of the program. Both packages and tasks are an outgrowth of
SIMULA classes.

Generic subprograms or packages allow the definition of program units that will be
applicable to all types of a given class (rather than just a single type). Derived types
can also be used to the same effect in many instances.

In addition to the usual built-in arithmetic datatypes, Ada provides predefined
character and string datatypes. Strings are vectors (one-dimensional arrays) of
characters, indexed by positive integers. The concatenation operator (called catenation) is
&. The built-in string facilities are, however, primitive, and require augmentation to be
truly useful. Access datatypes (pointers) are used with record types to do list
processing, much in the manner of Pascal. As with the string processing facilities, the list
processing facilities built into the language are clumsy and require extension.

1.3. AI'S NEEDS AND ADA'S FACILITIES

It has often been pointed out that LISP owes much of its success as an AI language
to its usefulness as a sort of high-level systems programming language, in which it is
possible to construct interpreters for higher-level languages. In other words, it is
extensible, although it was not designed with extensibility in mind, specifically. The fact that
LISP programs are themselves representations of lists facilitates extensibility. (It might
be remarked here that the extension languages have tended to share LISP's syntactic
shortcomings, partly for this reason.) The applicative* nature of LISP also facilitates
extension. Ada has been designed for extensibility, albeit of a limited sort, using
packages, generic procedures and tasks. It remains to be seen if this extensibility
will prove as useful as that of LISP.

*Ada is a registered trademark of the U.S. Government, Ada Joint Program Office.

Ada makes few concessions of a direct sort to AI (or to any particular application
area), the philosophy being that these facilities will be built upon the basic language. In
the Dallman quotation of §1, the package is mentioned. This will be the primary means
of adding AI-oriented features, including, but not limited to, string processing and list
processing (as described in §1.1). One might envision the creation of the following:

1. String definition and manipulation facilities more flexible than those built into Ada

2. List processing functions

3. Pattern definition and matching functions for strings and lists

4. Means of manipulating lists returned by the pattern matching functions

A package of string functions has been written by Major R. Bolz [personal
communication]. In the third paper included in this report, K. Wauchope reports on the provision
of list processing facilities and pattern matching functions for lists, while J. Kreuter
studies efficiency in string pattern matching methods that could be implemented in Ada.
The manipulation of the lists returned by pattern matching functions could be in the
manner of Post-X (see §2.3), as Wauchope points out. The exact manner of building in
the "actions" of Post-X is a subject for further investigation.

The tasking mechanisms of Ada lead to a number of interesting possibilities. One
of them is tentatively explored in the Kreuter paper. It is possible to use "coroutines",
which are just a form of task in Ada, to match patterns in a particularly elegant fashion.
For the purposes of the type of processing envisioned in our project, the pattern
matching would have to provide a structural description of the item matched, as well as an
indication of the match. This can be done in much the same manner as
assignments, making an assignment to each subpart of the pattern. Other possibilities
for the use of tasks arise in artificial intelligence in any of the areas where quasi-parallel
processes have been used. An example is "word expert" parsing [Rieger and Small,
1979].
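The coroutine idea can be illustrated with Python generators, which behave as a restricted form of coroutine. This is a sketch under stated assumptions, not the method of the Kreuter paper: each pattern is a generator that yields every position at which it can end, together with the bindings made so far, so that a later failure simply resumes an earlier generator. The names `lit`, `arb`, `seq`, `alt`, `bind` and `first_match` are all hypothetical.

```python
def lit(s):
    """Match the literal string s."""
    def match(text, pos, env):
        if text.startswith(s, pos):
            yield pos + len(s), env
    return match


def arb():
    """Match an arbitrary substring, shortest alternative first."""
    def match(text, pos, env):
        for end in range(pos, len(text) + 1):
            yield end, env
    return match


def seq(*parts):
    """Match each part in turn; resuming a part retries its alternatives."""
    def match(text, pos, env):
        if not parts:
            yield pos, env
            return
        rest = seq(*parts[1:])
        for mid, env1 in parts[0](text, pos, env):
            yield from rest(text, mid, env1)
    return match


def alt(*choices):
    """Try each alternative in order."""
    def match(text, pos, env):
        for choice in choices:
            yield from choice(text, pos, env)
    return match


def bind(name, part):
    """Record what part matched under name (the structural description)."""
    def match(text, pos, env):
        for end, env1 in part(text, pos, env):
            yield end, {**env1, name: text[pos:end]}
    return match


def first_match(pattern, text):
    """Return the bindings of the first match covering all of text, or None."""
    for end, env in pattern(text, 0, {}):
        if end == len(text):
            return env
    return None


pattern = seq(bind("subject", arb()), lit(" "),
              bind("verb", alt(lit("runs"), lit("sleeps"))))
```

A call such as `first_match(pattern, "the dog sleeps")` returns both the indication of success and a binding for each named subpart, which is the combination the paragraph above asks of the matcher.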

2. PATTERN-DIRECTED COMPUTATION 

In a pattern-directed computation, the operation that drives the computation is that
of finding a pattern in the data and making a change in the data at that point.
Pattern-directed computation has generally been identified with the processing of character
strings. Let us therefore turn to string processing languages to get a feel for this style of
programming.



*An applicative language works by function application. LISP is an example of an applicative language.
2.1. PATTERN-DIRECTED STRING PROCESSING LANGUAGES

Historically, pattern-directed languages are based on a philosophy built into the
normal algorithms of Markov [1954] and the canonical systems of Post [1943], both
of which antedate modern electronic computers. Both Markov algorithms and Post
systems provide languages adequate for writing any program (i.e. for realizing any
algorithm), by encoding the data as strings, if one accepts "Church's Thesis", as most
computer scientists and logicians do [Rogers, 1967]. More important, from our point of view
(since other programming languages are theoretically adequate in this same way), is the
fact that a particular style of programming, which many programmers find particularly
congenial, is natural in these languages.

The COMIT programming language of Victor Yngve [1958] was essentially a
computer implementation of a version of Markov algorithms (labeled Markov algorithms; see
[Galler & Perlis, 1970]). In that language, it is assumed that one is operating on a
workspace containing a sequence of constituents, which may be individual characters
or character strings. Each step of the program consists of an operation which tries to
match a pattern to a portion of the workspace (starting from the left end of the
workspace in its matching attempts and working to the right) and effect a change in
that portion. As an example, the statement

$1+ABC+$1+D+$+E = F+3+1+5

would match a single constituent followed by a constituent ABC, followed by another
single constituent, followed by a constituent D, followed by any number of constituents,
followed by a constituent E, and would replace all of these by a constituent F followed
by the first, third, and fifth items matched by the left hand side of the "equation". For
example, if the workspace contained

...QRS+ABC+DD+D+EFG+HIJ+E...

at the leftmost place in the workspace where the pattern matched, it would be changed
to

...F+DD+QRS+EFG+HIJ...

COMIT had a number of problems as a programming language, but this pattern-directed
mode of computation was not one of them, as it turned out to be a natural means of
processing character strings in computational linguistics and related fields. It also led to
a more successful family of languages, the first of which was called SNOBOL [Farber et
al., 1964], and the last of which was called SNOBOL4 [Griswold et al., 1973].
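The COMIT statement discussed above can be imitated directly on a list of constituents. The following Python sketch is an illustration only: it models the workspace as a list of strings, writes `"$1"` for a single arbitrary constituent and `"$"` for any number of constituents, and uses integers in the replacement to refer back to the numbered items of the left hand side. The function names `match_from` and `rewrite` are assumptions, and nothing here reproduces COMIT's actual implementation.

```python
def match_from(pattern, items, start):
    """Match pattern against items beginning at start.

    Returns one slice of items per pattern element, or None on failure.
    """
    def walk(p, i):
        if p == len(pattern):
            return []
        elem = pattern[p]
        if elem == "$":                          # any number, shortest first
            for j in range(i, len(items) + 1):
                rest = walk(p + 1, j)
                if rest is not None:
                    return [items[i:j]] + rest
            return None
        # "$1" accepts any single constituent; anything else is a literal
        if i < len(items) and (elem == "$1" or items[i] == elem):
            rest = walk(p + 1, i + 1)
            if rest is not None:
                return [items[i:i + 1]] + rest
        return None

    return walk(0, start)


def rewrite(pattern, replacement, workspace):
    """Apply the rule at the leftmost place where pattern matches."""
    for start in range(len(workspace) + 1):
        groups = match_from(pattern, workspace, start)
        if groups is None:
            continue
        matched_len = sum(len(g) for g in groups)
        new = []
        for r in replacement:
            if isinstance(r, int):               # numbered item from the left side
                new.extend(groups[r - 1])
            else:                                # new literal constituent
                new.append(r)
        return workspace[:start] + new + workspace[start + matched_len:]
    return None                                  # the rule does not apply


pattern = ["$1", "ABC", "$1", "D", "$", "E"]
replacement = ["F", 3, 1, 5]
workspace = ["X", "QRS", "ABC", "DD", "D", "EFG", "HIJ", "E", "Y"]
```

With these definitions, `rewrite(pattern, replacement, workspace)` carries out the change shown in the example above, leaving the surrounding constituents `X` and `Y` in place.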

The original SNOBOL language was similar to COMIT, but with a number of
important improvements. The most fundamental of these was the inclusion of variables
that could take on the values of strings, rather than the single workspace (COMIT had a
construct called "shelves" for storing away portions of the workspace, but SNOBOL's
string variables were handier). SNOBOL also had a more flexible flow of control than
COMIT and other improved features.

By far the most popular of the SNOBOL family of languages has been SNOBOL4.
The papers of Mr. Kreuter and Mr. Wauchope both mention SNOBOL4 patterns, so we
shall discuss them briefly here.
As mentioned above, the COMIT workspace constituents were not necessarily
single characters, but could be strings. Each constituent was basically indivisible, so if
"AND" was a single constituent, it was not treated as "A", "N", and "D". For the
purpose of illustration, however, assume that the constituents in the example above, except
for the constituent "ABC", are all single characters. With this assumption, the COMIT
statement

$1+ABC+$1+D+$+E = F+3+1+5

could be written in SNOBOL4 as

LEN(1) . V1 'ABC' LEN(1) . V3 'D' BREAK('E') . V5 'E' = 'F' V3 V1 V5

In SNOBOL4, all strings are based on single characters; the concept of multicharacter
constituents does not exist (in the pattern matching portion of the language, at least).
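For readers more familiar with regular expressions, the single-character version of the rule can be approximated as follows. This is a hedged Python sketch, not SNOBOL4: the named groups `v1`, `v3` and `v5` stand in for the variables of the same names, `[^E]*` plays the role of BREAK('E'), and the function name `apply_rule` is an assumption.

```python
import re

# One character, 'ABC', one character, 'D', anything up to the next 'E', then 'E'.
RULE = re.compile(r"(?P<v1>.)ABC(?P<v3>.)D(?P<v5>[^E]*)E", re.DOTALL)


def apply_rule(subject):
    """Replace the leftmost match by 'F' followed by v3, v1 and v5."""
    return RULE.sub(
        lambda m: "F" + m.group("v3") + m.group("v1") + m.group("v5"),
        subject,
        count=1,
    )
```

The comparison is only approximate: SNOBOL4 patterns are data objects that can be built up and named at run time, whereas the regular expression here is a fixed string.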



2.2. SNOBOL4 PATTERNS

Patterns, in SNOBOL4, are data objects, and may be given names in assignment
statements. Patterns are constructed out of pattern primitives, including variables and
string constants, using pattern operators. They may also contain assignment
statements.

2.2.1. Pattern Operators

Concatenation: (blank space) e.g. A B matches anything matched by pattern A
followed by anything matched by pattern B.

Alternation: (blank)|(blank) e.g. A | B matches anything matched by pattern A, if a
match is found. If not, it matches anything matched by pattern B, if that can be found.
If neither is found, it fails.

(Parentheses may be used in the conventional way to group items and establish the
order of operations.)

2.2.2. Pattern Variables 

POS(i) matches a null string after the i-th character. (POS(0) is the left end of the
string.)

RPOS(i) matches a null string before the i-th character from the right. (RPOS(0) is
the right end of the string.)

ARB matches an arbitrary string (the shortest one possible within the context of the
pattern in which it is included).

REM matches everything to the end of the string.

BREAK(x) matches an arbitrary string up to (but not including) the first
occurrence of any character in the string x (e.g. BREAK('abe') matches a string up to
one of the characters a or b or e that does not itself contain any of those characters).

SPAN(x) matches an arbitrary string made up of characters in x (i.e. it BREAKs at
anything not in x).

ANY(x) matches any single character in x.

NOTANY(x) matches any single character not in x.

LEN(n) matches an arbitrary n-character string.

BAL matches an arbitrary string balanced with respect to parentheses.

Other pattern variables include FAIL, FENCE, ABORT, ARBNO, TAB, RTAB,
SUCCEED, @, for explanations of which the reader is referred to the SNOBOL4
manual [Griswold et al., 1971].
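Several of the primitives above have simple operational readings. The Python sketch below treats each primitive as a function that, given a subject string and a cursor position, returns the new cursor position or `None` on failure. It is a simplification offered as an assumption-laden illustration: it ignores backtracking, covers only LEN, ANY, NOTANY, SPAN and BREAK, and follows the usual SNOBOL4 conventions that SPAN must match at least one character while BREAK may match the null string but fails if no break character is found.

```python
def LEN(n):
    """Match exactly n characters."""
    return lambda s, i: i + n if i + n <= len(s) else None


def ANY(chars):
    """Match one character that is in chars."""
    return lambda s, i: i + 1 if i < len(s) and s[i] in chars else None


def NOTANY(chars):
    """Match one character that is not in chars."""
    return lambda s, i: i + 1 if i < len(s) and s[i] not in chars else None


def SPAN(chars):
    """Match the longest non-empty run of characters drawn from chars."""
    def match(s, i):
        j = i
        while j < len(s) and s[j] in chars:
            j += 1
        return j if j > i else None
    return match


def BREAK(chars):
    """Match up to, but not including, the first character from chars."""
    def match(s, i):
        j = i
        while j < len(s) and s[j] not in chars:
            j += 1
        return j if j < len(s) else None
    return match
```

For instance, `BREAK("abe")("xyzbq", 0)` stops the cursor just before the `b`, in line with the description of BREAK given above.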

2.2.3. Other Primitives

Any string (enclosed in single or double quotation marks) may be used as a pattern. It
matches exactly itself.

NULL matches the null (zero length) string.

2.2.4. Assignment Operators

Immediate assignment (made to a matching element of the pattern as the pattern
match is attempted): (space)$(space).

Conditional assignment (made only if the whole pattern match succeeds):
(space).(space) (e.g. X . Y assigns whatever is matched by pattern X to variable Y
when X is part of a successful pattern match).



2.3. POST-X PATTERN MATCHING

The ultimate goal of the work reported in these papers is to make possible the
incorporation within Ada of packages that allow pattern matching of the sort defined
within the Post-X language [Bailes and Reeker, 1980a,b]. Post-X incorporates pattern
definition and matching into an applicative framework. In doing so, the powerful
pattern definition facilities of SNOBOL4 have been retained, while other aspects of the
utilization of patterns have been improved.

In an applicative framework, the pattern match must return a value that can be
acted upon by other functions. The pattern itself has been generalized to a more
powerful object, called the form.

A Post-X form consists of a series of alternative patterns and related actions.
Each pattern is very much like a pattern in SNOBOL4. A form may be passed
parameters (by value), which are then used in the pattern or action portions of that form.

A pattern determines the structure of the string to which it is matched. The
pattern contains a sequence of concatenated elements, which are themselves patterns,
primitive patterns (as in SNOBOL4), or strings. The value returned by the pattern is either
false (if it fails to match) or a parse tree designating the structure of the string that
corresponds to portions of the pattern. Portions of the parse tree can be accessed by the
use of selectors and used in the action portion.

As an example of some of these ideas, consider the definition of a form REPLACE
which takes a parameter GRAM (a context free grammar that consists of a sequence of
rules, with the nonterminals surrounded by angle brackets).

REPLACE GRAM := "<" * BREAK ">" * ">"
    {$< * ((REPLACE GRAM) <
           SELECT_RHS
           (ALT_LIST <
            (LHS_FIND $2 < GRAM)))
     * $>}
  | NULL {$$};

The * is a concatenation operator. Post-X allows an alternative postfix representation of
function composition, using an explicit postfix operator ".", which is sometimes easier to
read:

REPLACE GRAM := "<" * BREAK ">" * ">"
    {$< * GRAM .
     (<)(LHS_FIND $2) .
     (<)ALT_LIST .
     SELECT_RHS .
     (<)(REPLACE GRAM)
     * $>}
  | NULL {$$};

The form REPLACE, in whichever form it is written, expects to be passed a
grammar, as described above. It utilizes the forms LHS_FIND, ALT_LIST, etc., which must
be defined elsewhere in the program. It first finds the leftmost occurrence of a
nonterminal in that grammar. In the first alternative, which utilizes the SNOBOL4 function
BREAK, the first occurrence of "<" is automatically denoted by $1, the item matching
BREAK ">" (the second item in the pattern; notice that Post-X does not require
parentheses around the arguments of built-in functions, in order to lessen the number of
parentheses necessary) will be denoted by $2, and the ">" following will be denoted by
$3. These "$ variables" are all available to be used in the action or in other parts of the
pattern match.

REPLACE uses the nonterminal found ($2) as a parameter to LHS_FIND, which is
applied to the grammar GRAM to return the right hand alternatives. Then
SELECT_RHS selects an alternative, which is placed in the context of the nonterminal
matched by the pattern part of the form. Finally, REPLACE is matched (recursively) to
the result. If the first alternative fails, it means that there is no nonterminal. In that
case, the second alternative will be matched, and will return the entire string, which will
be a string in the language generated by GRAM.
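The behavior of REPLACE described above can be paraphrased in Python. This is a sketch of the idea rather than a translation of the Post-X form: the grammar is assumed to be a dictionary from nonterminal names to lists of right hand alternatives (standing in for LHS_FIND and ALT_LIST), the caller supplies the selection function (standing in for SELECT_RHS), and the name `expand` is hypothetical.

```python
import re

# A nonterminal is a name enclosed in angle brackets, e.g. <NP>.
NONTERMINAL = re.compile(r"<([^>]*)>")


def expand(text, grammar, select):
    """Rewrite the leftmost nonterminal in text until none remain."""
    m = NONTERMINAL.search(text)
    if m is None:
        # Second alternative: no nonterminal is left, return the whole string.
        return text
    alternatives = grammar[m.group(1)]      # right hand sides for the nonterminal
    chosen = select(alternatives)           # pick one alternative
    # Put the choice back into its context and match again, recursively.
    return expand(text[:m.start()] + chosen + text[m.end():], grammar, select)


grammar = {
    "S": ["<NP> <VP>"],
    "NP": ["the dog", "the cat"],
    "VP": ["sleeps", "chases <NP>"],
}
```

Passing `random.choice` as `select` generates a random string of the language, while a deterministic selector such as `lambda alts: alts[0]` makes the rewriting repeatable.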

Without understanding Post-X completely, it can be seen that pattern matching
and function application are the fundamental operations. Furthermore, it is necessary
for the pattern match to return a structural description of the string (the grouping of
higher-level units in the pattern and the selection of corresponding units of the matched
string is not illustrated in the example, but often turns out to be very useful). A portion
of Post-X has been implemented as STRIP, and its design rationale has been explained
in detail by Paul Bailes [1983].

2.4. PROLOG AS A PATTERN-DIRECTED LANGUAGE

The language Prolog has been chosen as the language of the Japanese "fifth
generation" computer initiative. It is a language that is becoming more and more popular in
artificial intelligence and computational linguistics. A standard reference is [Clocksin
and Mellish, 1981].

A program in Prolog consists of a series of clauses of the logical form

A_1 & A_2 & ... & A_n ⊃ C

represented in Prolog as "C :- A_1, A_2, ..., A_n." and interpretable as "to prove C, prove A_1,
then prove A_2, ..., then prove A_n."

The elementary terms, such as "A" above, are predicates and arguments (which
may be variables). For instance, "A" might be "BIGGER(x,y)". A problem is solved by
a Prolog program by finding an instance of a formula that is true and returning the
parameters that instantiate that instance. If the data provided in the program contains
pairs of, say, people who are bigger than other people, then it would be appropriate to
ask whether Paul is bigger than John ("?- BIGGER(Paul,John)") or to find the people
bigger than John ("?- BIGGER(x,John)"). The process of attempting to find true
instances is the logical operation of unification, which can also be viewed as pattern
matching. Much of the popularity of Prolog is due to the naturalness of this pattern
matching method of programming. In fact the quotation from (Warren et al.) in §1 is
talking specifically about Prolog.
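
As a rough illustration of how a query such as "?- BIGGER(x,John)" can be answered by
pattern matching, the following Python sketch matches a query against a small set of
facts. The facts, the helper names, and the convention that a lowercase name is a
variable are assumptions made for this illustration only. Full Prolog unification also
handles nested terms and rules, which are omitted here.

```python
# Facts are (predicate, arg1, arg2) tuples. These particular facts are invented.
FACTS = [
    ("BIGGER", "Paul", "John"),
    ("BIGGER", "Mary", "John"),
    ("BIGGER", "John", "Sam"),
]


def is_variable(term):
    # Assumed convention: a lowercase first letter marks a variable.
    return term[0].islower()


def match(query, fact):
    """Return variable bindings if the flat query matches the fact, else None."""
    if len(query) != len(fact):
        return None
    bindings = {}
    for q, f in zip(query, fact):
        if is_variable(q):
            # A variable seen twice must be bound to the same value both times.
            if bindings.setdefault(q, f) != f:
                return None
        elif q != f:
            return None
    return bindings


def solve(query):
    """Return the bindings for every fact that the query matches."""
    results = []
    for fact in FACTS:
        bindings = match(query, fact)
        if bindings is not None:
            results.append(bindings)
    return results
```

With these facts, solve(("BIGGER", "Paul", "John")) yields a single empty binding (the
query is true as it stands), while solve(("BIGGER", "x", "John")) yields one binding of
x for each person recorded as bigger than John.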

The pattern matching embodied in SNOBOL4 and in Post-X is a more limited form
than in Prolog, in that there are control mechanisms other than pattern matching
(primarily function composition or application in Post-X, both sequential control and
function composition in SNOBOL4). The purer approach of Prolog (although, like the pure
applicative control of LISP, often modified in practice) has advantages and
disadvantages. We feel that the Post-X framework will be more naturally embedded in the
Ada framework, and that if this is carefully done, it can result in an excellent language
for AI programming.

Within the LISP community, pattern matching has been recognized as important, 
but has not generally been viewed as fundamental. Thus Winston and Horn [1981] 
include a chapter on pattern matching, commenting that 

Although LISP itself has no pattern matching built in, it is easy to write pattern-
matching functions in LISP. Hence, we say that LISP is a good implementation
language for pattern matches.

An important experimental language built upon LISP, PLANNER [Hewitt, 1969],
partially implemented as MICRO-PLANNER [Sussman et al., 1971], features pattern-directed
procedure invocation. Winston and Horn conclude that many problems remain in pattern
matching, including how to deal with more general data structures (the problem
that Wauchope [1984] is tackling). They also point out that a matcher which can do
partial matches and report on how close they are to a full match would be very useful.

Another language that deserves mention in any discussion of pattern-directed
programming is Awk (Aho et al., 1979). Although Awk's patterns are of a restricted sort
(for purposes of efficiency), it is very easy to use, and is widely used as a utility
within the UNIX system, as well as in file processing programs.

3. PATTERN MATCHING AND PARSING

We have described some design aspects of pattern-directed languages. Of course,
designing the language is only half of the task; one must also implement it. We will now
discuss a central problem of the implementation of pattern-directed languages: the
pattern matching algorithm itself. Because we are interested in implementing facilities
along the line of Post-X, we will be interested in passing back a structural description of
a string. This is essentially the same thing as parsing a string according to a context
free grammar, so we shall next examine context free parsing.

3.1. CONTEXT FREE PARSING

Pattern matching can be considered to be a restricted form of context free parsing.
Without going into the details of context free grammars (a good basic reference is [Lewis
and Papadimitriou, 1981]; a more advanced one is [Harrison, 1978]), it is possible to see
why this is so. A pattern P consisting of a concatenation of patterns P_1 ... P_n will match
some string S if and only if there exists a decomposition of S into substrings such that S
= s_1 ... s_n and such that each s_i is a substring matched by the corresponding P_i. If
there are alternative patterns P_1 | ... | P_m (using the SNOBOL4 alternation operator "|"),
then one needs to try matching P_1, then P_2, etc. One might want to return either the
first match or all matches. The operations of concatenation and of alternation are also
the basic operations of context free grammars. Context is often relevant to the parsing
process, but if it is strictly the context of primitive patterns within the overall pattern
(as it tends to be in pattern matching), the power of a context free grammar will suffice.

In the case of P = P_1 ... P_n, an equivalent context free grammar would have the
production P → P_1 ... P_n. In the case of P = P_1 | ... | P_m, the grammar would have
productions P → P_1, P → P_2, ..., P → P_m. In either case, finding a successful pattern match is
equivalent to recognizing the string S by the corresponding grammar (determining 
whether or not it can be generated by the grammar). 
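
The correspondence between concatenation, alternation, and string decomposition can be
sketched in a few lines of Python. The tuple encoding of patterns ("lit", "cat", "alt")
and the function names are assumptions of this sketch, not part of SNOBOL4 or Post-X.
The function returns every position at which a match can end, which corresponds to
returning all matches rather than only the first.

```python
def ends(pattern, s, start):
    """Return the set of positions at which a match beginning at start can end."""
    kind = pattern[0]
    if kind == "lit":                      # a literal string
        text = pattern[1]
        return {start + len(text)} if s.startswith(text, start) else set()
    if kind == "cat":                      # P = P_1 ... P_n
        positions = {start}
        for sub in pattern[1:]:
            # Each sub-pattern continues from every place the previous one could end.
            positions = {e for p in positions for e in ends(sub, s, p)}
        return positions
    if kind == "alt":                      # P = P_1 | ... | P_m
        result = set()
        for sub in pattern[1:]:
            result |= ends(sub, s, start)
        return result
    raise ValueError("unknown pattern kind: %r" % (kind,))


def matches(pattern, s):
    """True if the pattern matches the whole of s (i.e. s is recognized)."""
    return len(s) in ends(pattern, s, 0)
```

For example, with P = ("cat", ("alt", ("lit", "a"), ("lit", "ab")), ("alt", ("lit", "c"),
("lit", "bc"))), the string "abc" is matched through two different decompositions,
"a" + "bc" and "ab" + "c".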

As explained in section 2.3 above, we desire to return a structural representation of
the string matched, that is, a parse tree that indicates how the match took place. It
is then possible to structure the patterns utilized so that this information will be useful.
Post-X makes use of the parse information to select out certain portions of the matched
string for modification in the action portion.

One of the key issues in the efficiency of parsing, addressed in the Kreuter paper, is
the control of nondeterminism. A nondeterministic algorithm [Floyd, 1967] is one that
has "choices" of various alternative computations at certain points in its operation.
These choices can lead to a successful completed computation or may lead to failure.
The idea of the nondeterministic algorithm is that if a failure occurs, then another choice
can be tried. One could, in fact, try all choices at once if one had sufficient parallel
computing capabilities, and this may be possible in the future. At present, we implement
nondeterministic algorithms on the machines that we have, which are designed for serial
computation. One way of implementing them is to backtrack when a failure occurs and
try another choice. Another is to try to keep around enough information to be able to
try all alternative choices in a "pseudo-parallel" manner. These alternatives are best
illustrated by looking at some parsing algorithms.

3.1.1. Recursive Descent

Suppose a context-free grammar has a rule of the form

    X → Y_1 Y_2

where each of the Y_i is either a terminal symbol or a nonterminal. A recursive descent
parsing algorithm will parse a string

    s_1 s_2 ... s_n

that is suspected of being generated by X, by calling a routine

    PARSE(X, s_1 ... s_n)

which must consider the possibilities that Y_1 matches the empty string and Y_2 matches
s_1 ... s_n, that Y_1 matches s_1 and Y_2 matches s_2 ... s_n, etc. To check the first
possibility, it calls

    PARSE(Y_1, Λ*) followed by PARSE(Y_2, s_1 ... s_n).

To check the second possibility, it calls

    PARSE(Y_1, s_1) followed by PARSE(Y_2, s_2 ... s_n)

and so forth. If any of the Y_i is a nonterminal, then PARSE(Y_i,x) will make further calls,
according to the grammar. If Y_i is a terminal (or Λ), then PARSE(Y_i,x) returns an
indication of success if and only if x is also a terminal (or Λ) and Y_i = x. The
alternatives are tried in a nondeterministic fashion either until a successful parse is
found or until all successful parses are found, depending on which one wants.

The algorithm can be extended in a straightforward fashion to cases where the
right hand side of the production consists of more or fewer than two symbols, either
terminal or nonterminal (or Λ).

As an example of recursive descent parsing, consider the grammar

    S → aTc
    S → SU
    S → T
    T → e
    U → d

on the input string aecd. We start with PARSE(S,aecd), which calls

    PARSE(a,Λ);PARSE(Tc,aecd),
    PARSE(a,a);PARSE(Tc,ecd),
    PARSE(S,Λ);PARSE(U,aecd),
    PARSE(S,a);PARSE(U,ecd), and
    PARSE(T,aecd).

Notice that PARSE(a,a) succeeds, according to the criterion for success given
above, whereas PARSE(T,aecd) will call PARSE(e,aecd), which will not succeed. A real
problem occurs with the calls to PARSE(S,x), for any x. This is because of the
production S → SU, which means that PARSE(S,x) will be called again and again recursively,
and that the program will therefore be in a loop. Any production of the form

    X_i → X_i W

(where W is any string of nonterminals and terminals) will cause this problem. There
are various solutions to the left recursion problem, one of which leads to predictive
parsing, discussed in the next section.
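
The PARSE routine and the left recursion loop can be sketched as follows. This is a
minimal recognizer written for illustration, not the implementation discussed in the
papers that follow. The dictionary encoding of the grammar and the depth limit are
assumptions of the sketch; the limit is there only so that the loop shows up as an
exception rather than running until the interpreter's stack is exhausted.

```python
def parse(grammar, symbol, text, depth=0, limit=50):
    """Recursive descent recognizer: can symbol generate exactly text?

    grammar maps each nonterminal to a list of right hand sides (tuples of
    symbols); any symbol that is not a key of grammar is treated as a terminal.
    """
    if depth > limit:
        raise RecursionError("PARSE(%s, %r) keeps calling itself" % (symbol, text))
    if symbol not in grammar:                    # terminal: must equal the text
        return symbol == text
    for rhs in grammar[symbol]:                  # try each alternative in turn
        if parse_sequence(grammar, rhs, text, depth + 1, limit):
            return True
    return False


def parse_sequence(grammar, rhs, text, depth, limit):
    """Try every way of dividing text among the symbols of rhs."""
    if not rhs:
        return text == ""                        # empty right hand side
    first, rest = rhs[0], rhs[1:]
    if not rest:
        return parse(grammar, first, text, depth, limit)
    for k in range(len(text) + 1):               # first gets text[:k], rest gets text[k:]
        if parse(grammar, first, text[:k], depth, limit) and \
                parse_sequence(grammar, rest, text[k:], depth, limit):
            return True
    return False


# The example grammar without, and then with, the left recursive production S -> SU.
NO_LEFT_RECURSION = {"S": [("a", "T", "c"), ("T",)], "T": [("e",)]}
WITH_LEFT_RECURSION = {"S": [("a", "T", "c"), ("S", "U"), ("T",)],
                       "T": [("e",)], "U": [("d",)]}
```

With NO_LEFT_RECURSION the call parse(NO_LEFT_RECURSION, "S", "aec") succeeds. With
WITH_LEFT_RECURSION the call for "aecd" reaches PARSE(S,Λ), which calls PARSE(S,Λ)
again through S → SU, and the depth limit turns the loop into a RecursionError.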

3.1.2. Predictive Parsing

In order to avoid the problem of left recursion and the infinite loop that it can
cause in recursive descent parsing, one can put each production into Greibach normal
form [Greibach, 1965], where each production is of the form

    X → a X_1 X_2 ... X_n

where a is a terminal character and the X_i are nonterminals (including the case n = 0,
where no nonterminals are found on the right hand side). Actually, it is always possible
to find an equivalent (in the sense that it generates the same language) grammar in
Greibach two form, that is, with all productions of one of the forms

    (i)   X_i → a_j X_k X_l

    (ii)  X_i → a_j X_k

    (iii) X_i → a_j

and possibly

    (iv)  X_i → Λ

* The symbol Λ is used for the empty (zero-length) string.

A recursive descent algorithm will then work without getting into a loop. It is also
possible to write a simple non-recursive algorithm using a stack (which basically does what
the computer would do in implementing the recursion, but does not generally have to
push down as much information into the stack, and is therefore marginally more
efficient).

This method of parsing, using a stack to keep the information needed to do the
recursion, was used early in the history of computational linguistics, by Kuno and
Oettinger [1962], and is called predictive parsing. It operates smoothly and efficiently in
many naturally occurring cases, especially if the strings do not become too long. An
informal description of predictive parsing is as follows:

1) The algorithm is initialized by placing S* in the pushdown store and scanning the
leftmost terminal symbol of the input string.

2) Whenever a character a_j is under scan and X_i is on top of the stack, pick one of the
productions with X_i as left hand side and a_j as leftmost character on the right hand
side, pop up X_i, and push X_l followed by X_k (so that X_k will be on top), X_k alone, or
nothing, depending on whether the production is of form (i), (ii) or (iii) above,
respectively, and move on to scan the next character to the right in the input string.
(In the case of a Λ right hand side, X_i can be popped up without moving on to scan
another character.)

3) Accept the string if and only if the end of the string is encountered precisely at the
same time that the stack becomes empty. Otherwise, the algorithm fails.
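
The three steps above can be sketched in Python as follows. The grammar in the sketch
is a hand-made Greibach two form grammar for the strings aec, e, aecd, ed, and so on;
it is an assumption made for this illustration and is not copied from this report.
Productions with Λ on the right hand side are not handled. The list named pending
plays the role of the second stack that backtracking requires.

```python
# Each production is (left hand side, leading terminal, following nonterminals).
# Assumed example grammar; see the lead-in above.
PRODUCTIONS = [
    ("S", "a", ("T", "X")),
    ("S", "e", ("V",)),
    ("S", "e", ()),
    ("X", "c", ("V",)),
    ("X", "c", ()),
    ("V", "d", ("V",)),
    ("V", "d", ()),
    ("T", "e", ()),
]


def predictive_parse(text, start="S"):
    """Return True if text is accepted; wrong choices are undone by backtracking."""
    # A configuration is (input position, parsing stack); the top of the
    # parsing stack is the last element of the tuple.
    pending = [(0, (start,))]            # configurations still to be tried
    while pending:
        pos, stack = pending.pop()
        if not stack:
            if pos == len(text):         # stack empties exactly at end of input
                return True
            continue                     # stack emptied too early: this choice fails
        if pos == len(text):
            continue                     # input used up but stack not empty: fail
        top, rest = stack[-1], stack[:-1]
        for lhs, terminal, following in PRODUCTIONS:
            if lhs == top and terminal == text[pos]:
                # Pop the top symbol and push the following nonterminals in
                # reverse order, so that the first of them ends up on top.
                pending.append((pos + 1, rest + tuple(reversed(following))))
    return False
```

Because every applicable production is saved in pending, a choice that later fails is
simply abandoned and the next saved configuration is tried.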



Notice that the formulation uses the "nondeterministic" phrasing "pick one of ...",
and the notion of the algorithm "failing" in certain cases where it is not clear that
no parse exists. This means that if the algorithm makes a mistake (picks a production
that does not lead to a successful parse), then it can backtrack on failure and make
another choice, until no more productions remain to be picked. Any such recognition
scheme must backtrack anyway to try all alternatives if all, rather than merely one, of
the parses are to be found. The nondeterminism inherent in the predictive algorithm
means that the algorithm will require another stack (for the backtracking) and will tend
to take exponential time in "bad cases". Some of the pattern matching cases are bad
enough that the combinatorial explosion of possibilities slows the process appreciably.
The algorithm discussed below (Earley's Algorithm) tends to be faster in these "bad"
cases.

* When S is used as a nonterminal symbol in a grammar or as a stack symbol, it will always
be used to denote the axiom, or root symbol, of the grammar, as is conventional.

As an example of predictive parsing, consider the grammar of §3.1.1, converted to
Greibach two form to obtain:

    S → aTX
    S → cTX
    S → eV
    V → d
    V → dV
    X → cV
    T → e



The following table shows the actions on the stack that would result from reading a
given symbol in the input for each possible symbol on top of the stack:

Action table for the grammar

    Symbol on       Next Input      Action
    Top of Stack    Symbol

    S               a               pop S, push X, push T
    S               e               pop S, push V
    S               c               pop S, push X, push T
    V               d               pop V
    V               d               pop V, push V
    X               c               pop X, push V
    T               e               pop T


On the input string aecd, the algorithm's behavior is as shown in the following table
(the top of the stack is written first):

Actions on the string "aecd"

    Stack           String to       Stack
    (before)        be Read         (after)

    S               .aecd           T X
    T X             .ecd            X
    X               .cd             V
    V               .d              (empty)


Now let us consider the generation of a structural description. We start at the top,
with an S. Each time that a symbol X_i is popped up and replaced by a right hand side
q, we can expand the parse tree portion X_i to X_i[q], always expanding in a leftmost
fashion. The parse tree for our example is:

            S
          / | \
         a  T  X
            |  | \
            e  c  V
                  |
                  d

3.1.3. Earley's Algorithm

Earley's parsing algorithm is mentioned in both the Kreuter and Wauchope papers.
One of Kreuter's tasks was to implement Earley's algorithm in Ada as a method of
pattern matching. The original form of this algorithm is due to Jay Earley [1970], and the
form used by Kreuter is based on a modification due to Graham, Harrison and Ruzzo
[1976] (see also [Harrison, 1978]). Earley's algorithm is as efficient as any "practical"
general context free parsing algorithm known (there are some theoretical results that are
marginally more efficient, including [Valiant, 1975]) and does not require that the
grammar be converted to any special form. The efficiency of Earley's algorithm is achieved
by carrying around possible analyses in parallel, rather than backtracking, as in
predictive parsing. The analyses cannot actually be done in parallel, of course, on the
serial machines that are standard today; but the algorithm eliminates the repeated
generation of information on partial parses that is inherent in the usual backtracking
method.*

We will now give an informal description of the modified Earley algorithm as a
recognizer (the recovery of the parse tree will be discussed later), based on the
description of Earley [1970], with modifications:

The algorithm scans an input string a_1 ... a_n from left to right. As each symbol a_i is
scanned, a set of "states" S_i is constructed which represents the condition of the
recognition process at that point in the scan. In the modified algorithm, each state set
S_i is represented as the column of an array. Each state in the set represents (1) a
production such that the algorithm is currently scanning a portion of the input string
which is derived from its right hand side (the portion to the right of the arrow), (2) a
point-of-scan marker (dot; also called a cursor) in that production which shows how much
of the production's right hand side has been recognized so far. In Earley's original
formulation, a pointer was also kept to the position in the input string at which the
algorithm began to look for an instance of that particular production. This is not
necessary when using the array format.

The algorithm continues as long as any one of three operations is applicable to a
production in the array. The operations are mutually exclusive. The predictor operation
is applicable to a state when there is a nonterminal immediately to the right of the dot.
Its effect is to add one new state to S_i for each alternative of that nonterminal. The
point-of-scan dot is placed at the beginning of the right hand side of each production
added by the predictor, since none of its symbols has yet been scanned. In the modified
algorithm, these are always placed in the array at position (i,i). Thus the predictor adds
to the array at (i,i) all productions which might generate substrings beginning at a_i (but
only adds one copy of any production, thus avoiding the danger of infinite looping
inherent in recursive descent).

* Earley's algorithm is an example of a more general method of reducing exponential
processes to polynomial processes. This topic will not be explored here, as it is beyond
the scope of this work, and, in fact, has not been systematically developed in the
literature. For a discussion of some relevant considerations, see [Tueei, forthcoming] and
[Pereira and Warren, 1983]. The latter reference also points out that Earley's algorithm is
actually a particular case of chart parsing, systematized in [Kay, 1980].

The scanner is applicable just in case there is a terminal to the right of the dot in
some production in column i. The scanner compares that symbol with a_i, and if they
match, it adds the production to column i+1, in the same row as the original production
in column i, with the dot moved over one position in the production to indicate that that
terminal symbol has been scanned. After the scanner is applied to all productions in a
column to which it is applicable, the algorithm moves on to the next column.

The third operation, the completer, is applicable to a production if its dot is at the
end of its right hand side. If the left hand side of the production is "P" and the
production is in row i, then the completer adds to the current column all productions from
column i which have P directly to the right of the dot, keeping each in its own row and
moving the dot one place to the right (i.e. over P). Intuitively, column i is the state
set the algorithm was in when it predicted the possibility of the production just
completed (the one with left hand side P). Now that P has been successfully found, the
completer goes back to all the states in S_i which caused the algorithm to look for a P,
and moves the dot over the P in these states to show that it has been successfully
scanned.

In the case of rules with Λ on the right hand side, further modifications to each of the
processes mentioned need to be made. These will not be detailed here, but may be found
in any of the references mentioned above.
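
The three operations can be sketched in Python as a recognizer that builds the array
column by column. This is an illustrative sketch under stated assumptions, not Kreuter's
Ada implementation: the dictionary encoding of the grammar and all names are invented
here, rows and columns are numbered from 1 as in the text, and rules with Λ on the
right hand side are not handled.

```python
def earley_array(grammar, start, text):
    """Build the array of dotted productions for text.

    grammar maps each nonterminal to a list of right hand sides (tuples of
    symbols). The result maps a column number j to a set of entries
    (row, lhs, rhs, dot), i.e. the production lhs -> rhs at cell (row, j)
    with the point-of-scan marker before rhs[dot].
    """
    n = len(text)
    columns = {j: set() for j in range(1, n + 2)}
    for rhs in grammar[start]:
        columns[1].add((1, start, rhs, 0))
    for j in range(1, n + 2):
        worklist = list(columns[j])

        def add(col, entry):
            if entry not in columns[col]:          # only one copy of any entry
                columns[col].add(entry)
                if col == j:
                    worklist.append(entry)

        while worklist:
            row, lhs, rhs, dot = worklist.pop()
            if dot == len(rhs):                    # completer
                for r, lhs2, rhs2, dot2 in list(columns[row]):
                    if dot2 < len(rhs2) and rhs2[dot2] == lhs:
                        add(j, (r, lhs2, rhs2, dot2 + 1))
            elif rhs[dot] in grammar:              # predictor: entries go to (j, j)
                for alternative in grammar[rhs[dot]]:
                    add(j, (j, rhs[dot], alternative, 0))
            elif j <= n and rhs[dot] == text[j - 1]:   # scanner: same row, next column
                add(j + 1, (row, lhs, rhs, dot + 1))
    return columns


def recognizes(grammar, start, text):
    """True if a completed production for start lies in cell (1, n+1)."""
    last = earley_array(grammar, start, text)[len(text) + 1]
    return any(row == 1 and lhs == start and dot == len(rhs)
               for row, lhs, rhs, dot in last)
```

Note that the left recursive production S → SU causes no difficulty here, because the
predictor never adds a second copy of an entry that is already present in a column.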

Example

The algorithm described above, using the grammar

    S → aTc
    S → SU
    S → T
    T → e
    U → d

to recognize the input string

    aecd

produces the array below.

    Column   Unread input   Cell     Productions
             (. indicates
             point of scan)

    1        .aecd          (1,1)    S → .aTc
                                     S → .SU
                                     S → .T
                                     T → .e
    2        .ecd           (1,2)    S → a.Tc
                            (2,2)    T → .e
    3        .cd            (1,3)    S → aT.c
                            (2,3)    T → e.
    4        .d             (1,4)    S → aTc.
                                     S → S.U
                            (4,4)    U → .d
    5        .              (1,5)    S → SU.
                            (4,5)    U → d.



Commentary on the Example:

The Array:

Since the input string is of length 4, the array will have 5 columns. These are labeled in
the example with the input string, with a point-of-scan marker to show how much of the
string has been read prior to entering any productions into that column.

Column 1:

The array is initialized by the entry into (1,1) of the productions S → .aTc, S → .SU,
S → .T. The predictor then causes the production T → .e to be added to (1,1). Since
the predictor and completer are then no longer applicable, the scanner is invoked, causing
entries to be placed in column 2.

Column 2:

The scanner looks at the next character and finds it to be a. Since the only production
in column 1 with an a at the point of scan is S → .aTc, in row 1, the production
S → a.Tc is entered into (1,2). The predictor is now applicable, causing T → .e to be
added to (2,2).

Column 3:

The scanner first operates on the production T → .e, causing T → e. to be entered into
(2,3). This latter causes the completer to be invoked. Since it occurs in row 2, the
completer searches column 2 and finds one production, S → a.Tc, with a T immediately
following the point-of-scan marker. This production is therefore moved horizontally to
column 3, entered into (1,3) as S → aT.c

Column 4:

The scanner now finds a c at the point of scan and moves S → aT.c into this column as
S → aTc. (with the dot at the end). The completer then finds S → .SU in column 1 and
moves it to (1,4) as S → S.U. This causes the predictor to enter U → .d into (4,4).

Column 5:

The scanner is used again, producing the production U → d. in (4,5). This causes the
completer to search column 4 for a production with U at the point of scan, and it finds
S → S.U and moves it across to (1,5) as S → SU. (with the dot at the end). The fact that
there is a completed production in (1,5) with an S as left hand side and the fact that
the string is now exhausted together indicate that a successful parse has been found.

To recover the parse, it merely needs to be noted that the portion of the right hand
side of a production in the array to the left of the point of scan has successfully been
matched to a portion of the string being parsed that is given by the coordinates of its
location in the array. If the production is in the array at (i,j), then that portion of the
right hand side has been matched to positions i through j-1 (notice that this means
productions added by the predictor, which have nothing to the left of the dot and are at
(i,i), have been matched to nothing, as we would desire). Completed productions
(ones with the dot at the end of the right hand side) constitute a successful parse of a
portion of the string, which, in a parse tree, would be dominated by the left hand side of
the production. In our example, for instance, S → SU. is found in (1,5), so it
constitutes a successful parse of positions 1 through 4, that is, of the whole string,
while U → d. parses only positions 4 through 4, that is, only the fourth character. The
production S → aTc. parses positions 1 through 3, while T → e. parses position 2. The
single characters a, e, c, and d cover positions 1, 2, 3, and 4, respectively, of course.

An algorithm to recover the parse must start with the (1,n) position, where n-1 is
the length of the string, then check (1,k) and (k,n) for each k = 1 ... n. For each one
found, a recursive call will check in the same manner until everything is reduced to
single symbols. The details can be found in [Harrison, 1978] (though the reader should be
aware that he numbers his array from zero rather than one).

3.2. USING EARLEY'S ALGORITHM TO MATCH PATTERNS

Once the parse array has been formed, all parses can be found in time proportional
to n^2, where n is the length of the input string, using the algorithm mentioned in the
last section. Formation of the array itself takes, in the worst case, time proportional to
n^3, because the completer operation potentially has to examine a whole column of the
array, which takes some multiple of n operations, for each of some multiple of n entries
in each of the columns (see [Harrison, 1978] for a detailed analysis). Storage for a parse
array is proportional to n^2 (since it is two-dimensional and each dimension is
proportional to n) but can be large if the grammar is large. However, once the results of
the pattern match are no longer needed, the storage can be reclaimed. There is also the
possibility of storing the array as a list if it is sparse. These factors need to be
investigated, as Kreuter is continuing to do (see the second paper of this collection).

In order to understand the emphasis of Kreuter's paper, let us consider briefly how
the pattern primitives of SNOBOL4 (§2.2), as used in Post-X (§2.3), would be treated in
an appropriately modified Earley's algorithm.

Concatenation and alternation and the grouping thereof by parentheses are
reflected in the composition of the grammar rules. Thus

    P = Q | ('abc' POS(5)) LEN(12) REM

(where Q is another pattern) would become the grammar

    P → Q
    P → R LEN(12) REM
    R → 'abc' POS(5)

Assignment operators are not used, since the branches of the tree according to the
grammar are used in the Post-X action statements. Notice that the parentheses used to
group the elements of the pattern have affected the grammar produced, and will
therefore affect the branches of the tree produced upon a successful match (parse) and
the selectors used to refer to matched substrings in the action portion of the form (see
§2.3).
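
The way that grouping determines the grammar can be sketched as a small translation
routine. The nested-tuple encoding of patterns and the generated names R1, R2, ... are
assumptions of this sketch. Primitives such as LEN(12) are simply carried along as
strings, and the argument of POS in the example is a guess at a digit that is hard to
read in the source.

```python
def to_grammar(name, pattern):
    """Translate a pattern expression into a list of (lhs, rhs) productions.

    Patterns are written as nested tuples: ("alt", p1, p2, ...) for
    alternation and ("cat", p1, p2, ...) for concatenation. Anything else is
    a primitive or the name of another pattern and is copied unchanged.
    Each parenthesized group becomes a new nonterminal R1, R2, ...
    """
    rules = []
    counter = [0]

    def symbol_for(sub):
        if isinstance(sub, tuple):               # a nested group
            counter[0] += 1
            fresh = "R%d" % counter[0]
            define(fresh, sub)
            return fresh
        return sub

    def sequence(pat):
        # The symbols of one alternative: a concatenation or a single item.
        if isinstance(pat, tuple) and pat[0] == "cat":
            return tuple(symbol_for(s) for s in pat[1:])
        return (symbol_for(pat),)

    def define(lhs, pat):
        if isinstance(pat, tuple) and pat[0] == "alt":
            for alternative in pat[1:]:          # one production per alternative
                rules.append((lhs, sequence(alternative)))
        else:
            rules.append((lhs, sequence(pat)))

    define(name, pattern)
    return rules


# P = Q | ('abc' POS(5)) LEN(12) REM, with the parenthesized group nested.
EXAMPLE = ("alt", "Q", ("cat", ("cat", "'abc'", "POS(5)"), "LEN(12)", "REM"))
```

Calling to_grammar("P", EXAMPLE) produces one production for each alternative of P and
one for the generated nonterminal that stands for the parenthesized group.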

The pattern primitives can be dealt with when the scanner or predictor are used.
They are treated as follows:

POS(i). If POS(i) is found at the point of scan, then the production POS(i) → Λ. can
be entered by the predictor in that column if and only if it is column i+1 (indicating
that i characters have been scanned).

RPOS(i). Treated similarly, with the count being from the end of the string (this will
mean that the string length needs to be obtained before the parse, which is advisable for
purposes of efficient storage allocation for the array anyway).

ARB. If ARB is found at the point of scan in column i, then the production ARB →
Λ. is entered into position (i,i) in the array, ARB → s_i. is entered into (i,i+1), where
s_i is the i-th character in the string being matched, etc. to the end of the string. (It
is also possible to deal with ARB on a column-by-column basis by entering only the
production mentioned in (i,i) and the production ARB → s_i ARB. This treatment will
produce a right-branching parse tree which can be modified into the desired tree by
post-processing.)

REM. When REM is found at the point of scan in column i, then the production REM
→ X. can be entered into (i,n), where X is the remainder of the string and n is the last
column (i.e. X is s_i ... s_(n-1)).

BREAK(x) is treated like ARB, except that it is necessary to check for the break
characters and enter the productions accordingly. That is, BREAK(x) → y. is entered
into the array only if y does not contain any occurrence of a character in x.

SPAN(x) is treated like BREAK, except that the string y in the description must
contain only characters found in the string x.

ANY(x) matches only single characters. Thus the production ANY(x) → s_i. is entered
only into (i,i+1), and only if the i-th character is in x.

NOTANY(x) is treated analogously to ANY except that the i-th character must not be
in x.

LEN(n) is treated like ARB, except that its production is only entered at (i,i+n).

BAL can be done analogously to ARB. If productions are to be entered into (i,i),
(i,i+1), ... up to the end of the string, then it will be necessary to check for balance in
the strings before entering the appropriate productions. Alternatively the productions
BAL → Λ, BAL → '(' BAL ')' and BAL → BAL BAL can be entered at (i,i), and the
parse obtained can be postprocessed to obtain the appropriate tree structure.

Other pattern variables are dealt with analogously. The treatment of NULL and literal
strings should be obvious (just the usual treatment in Earley's algorithm).
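
The treatment of the primitives can be summarized by asking, for a primitive found at
the point of scan in column i, which cells (i,j) receive a completed production. The
Python sketch below answers that question for most of the primitives listed. It follows
the descriptions above literally (so BREAK and SPAN are handled like ARB with a check
on the characters), which is not necessarily how SNOBOL4 itself matches them. The
function name and argument conventions are assumptions of the sketch, and BAL is
omitted.

```python
def end_columns(name, arg, text, i):
    """Columns j such that a completed production for the primitive goes into (i, j).

    Columns are numbered from 1: column i means that i-1 characters have been
    scanned, and an entry at (i, j) covers characters i through j-1.
    """
    last = len(text) + 1                       # number of the last column
    rest = text[i - 1:]                        # characters not yet scanned
    if name == "POS":                          # only after exactly arg characters
        return [i] if i == arg + 1 else []
    if name == "RPOS":                         # only with exactly arg characters left
        return [i] if last - i == arg else []
    if name == "ARB":                          # every substring starting here
        return list(range(i, last + 1))
    if name == "REM":                          # the whole remainder
        return [last]
    if name == "LEN":                          # exactly arg characters
        return [i + arg] if i + arg <= last else []
    if name == "ANY":                          # one character that is in arg
        return [i + 1] if rest and rest[0] in arg else []
    if name == "NOTANY":                       # one character that is not in arg
        return [i + 1] if rest and rest[0] not in arg else []
    if name in ("BREAK", "SPAN"):
        k = 0                                  # length of the longest acceptable run
        while k < len(rest) and (rest[k] in arg) == (name == "SPAN"):
            k += 1
        return list(range(i, i + k + 1))
    raise ValueError("primitive not covered by this sketch: %s" % name)
```

For the string "aecd", for example, REM found in column 2 gives the single cell (2,5),
and ARB found in column 3 gives the cells (3,3), (3,4) and (3,5).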

3.3. ALTERNATIVE PATTERN MATCHING ALGORITHMS

There are various fast string matching algorithms available, but these were not
considered in the research because of the requirement for returning a structural
description, in order to enable Post-X-like processing. For some specific string matching
algorithms and references to the literature, see [Liu and Fleck, 1979]. The reader should
be aware that the SNOBOL4 patterns are more powerful than, for instance, regular
expressions, which certain algorithms, such as those employed in text editors, match
rather quickly.

4. EXTENDING THE PATTERN-DIRECTED PARADIGM

Given the fact that so much of AI, and even ..., consists of manipulation of lists
and trees, rather than strings, the fact that SNOBOL4 was not considered a list
processing language was undoubtedly one reason that it did not become a major AI
programming language, despite its excellent pattern-directed string processing facilities
(and, not incidentally, free distribution by Bell Laboratories). SNOBOL4 does allow the
definition of datatypes, which are basically list structures, but the language for using
these is not pattern-directed, and the trace and dump facilities are not developed
sufficiently to make them easy to use. (Other reasons for SNOBOL4's failure to capture the
"AI market" have to do with control structures [Bailes and Reeker, 1980a].)

The Post-X language (see §2.3), while originally designed for string processing,
sought to provide pattern matching on trees in a manner that would be analogous to the
string pattern matching facilities of the language. The method of doing this was to
extend the use of the SNOBOL4 pattern BAL (see §2.2.2) to allow the specification of
the value of the structure within a balanced set of parentheses designating a tree. In
addition, some tree functions were added for use within the action portion of the forms
dealing with trees (see [Bailes and Reeker, 1980b]). The attempt was only partially
successful. Though the specifications were easy to write and easy to read in some cases,
they were confusing in others, partially because of confusion among labels within trees
and data items on the leaves of trees. Wauchope seeks to remedy this deficiency in the
work reported in the third paper of this report, and in continuing work. As Winston
and Horn [1981] have said, "Building in these capabilities can be hard. The literature
offers little guidance".

5. CONCLUSION 

The following papers address in a tentative way two important issues in the
provision of pattern-directed string and list processing in Ada. Kreuter's paper deals with
alternative algorithms for string pattern matching which will also return parses or
structural descriptions of the string, where structural indicators are built into the
patterns, in the manner of Post-X. The matching process for such general patterns is
time-consuming, so efficiency will be an important consideration. Wauchope's paper makes a
further extension of the pattern-directed paradigm, to arbitrary LISP-type data
structures. This work should lead to a useful alternative language for artificial
intelligence, using Ada, and is being continued.

In considering future applications of artificial intelligence, it is important to realize
that game playing, language processing, expert systems, and the other sorts of things
that we conventionally think of under the umbrella of AI are going to be combined with
simulations, numerical programs, large file processing applications, and the like. For
these "conglomerate" applications, the languages that have most commonly been used in
AI research may not be the most useful. In our view, Ada can provide an excellent
environment for artificial intelligence applications of the future because of its flexibility
and generality. The problems addressed in this report (provision of appropriate
facilities through packages and making those facilities efficient enough that large and
complex programs will be feasible within them) are ones that need to be addressed if this
potential is to be realized.

To test the usefulness of Ada in artificial intelligence programming, it is desired to develop
Ada packages which mirror the capabilities of languages such as LISP or SNOBOL4 which have
proven their utility in artificial intelligence. The development of SNOBOL4-like pattern matching
packages was undertaken by this author. Viewing pattern matching as an extended parsing
problem, Ada packages for pattern matching utilizing Earley's efficient parsing algorithm were
developed, as well as packages implementing the more traditional backtracking (recursive descent)
approach. Since a full Ada implementation was not available at the time, these efforts should be
considered as preliminary, but indicate a direction for further research.

1. INTRODUCTION: RESEARCH OBJECTIVES

This project is part of an overall effort to add useful artificial intelligence programming
tools to Ada. One such tool is pattern-directed string processing, of the sort available
in the language Post-X [Bailes and Reeker, 1980a,b]. This involves the implementation
of pattern matching algorithms in Ada which actually return a parse tree of the
match of a pattern according to a structured pattern. In other words, pattern matching
according to a context-free grammar with primitives like those of SNOBOL4 [Griswold
et al., 1971] is the goal.

Parsing can be an expensive operation, timewise, to use over and over as a basis of
a programming language, especially when the patterns are as general as those in SNOBOL4.
The purpose of the research reported here was to consider the particular parsing
algorithms that can be implemented using Ada packages. Because a number of considerations
are involved, including the basic efficiency of the algorithm and the efficiency
of its implementation in the Ada environment, it is advisable to do this in an experimental
manner, implementing and testing.

2. DEFINITION OF PATTERNS

Before any pattern matching algorithm could be implemented, a suitable definition,
one easily represented in Ada, had to be developed. The packaging facilities of
Ada would then allow the pattern representation and pattern building functions to be
developed and compiled independently from the pattern matching routines. The packaging
facilities allowed by the available compiler are at present incomplete (see the
Recommendations in §4 of this paper for a discussion of the shortcomings of the current
version of the compiler used in this work) but it was possible to demonstrate within
them a measure of encapsulation. A fully validated Ada (one that implements the full
definition of the language) will enable more extensive use of the package to build hierarchies
of libraries of packages, with each library at a given level containing packages useful
to the applications at the next higher level. Thus at the bottom level the libraries
would contain packages of generally useful abstract data types such as stacks, queues,
linked lists, sequences, strings (a more complex variety of string than that built into
Ada), matrices, etc., defined in terms of the built-in types provided by Ada. At the next
higher level would be packages that could use these lower level defined types. For
instance, the pattern matching packages would be defined at this level. At the next
higher level would be packages utilising complex structures such as patterns; for
instance, a compiler could be defined at this level, using the pattern matching algorithms
to do the parsing.

The first working definition of a pattern was: 

- A PATTERN is an (unconstrained) array of ALTERNATIVES.

- An ALTERNATIVE is an (unconstrained) array of BEADs. (Bead is used here to
correspond to SNOBOL4 terminology [Griswold et al., 1971].)

- A BEAD is any of

(i) a string;

(ii) a PATTERN;

(iii) a primitive function (primitive functions selected corresponded to the most
useful SNOBOL4 primitives).

The available compiler, though it supports unconstrained arrays, does not support 
size-variant records, so the utility of using unconstrained arrays in a package is limited. 
The next working definition for a pattern therefore made an Alternative a linked list of 
Beads. Unfortunately, without generic packages, a linked list could not be conveniently 
defined outside the pattern package. Thus, although the type Alternative was imple- 
mented as a linked list, a Linked List type was never explicitly defined. The working 
definition for a Pattern then became, in Ada:

    type prim_func is (ARB1, REMAIN1, POS1, SPAN1, ANY1,
                       NOTANY1, BREAK1, TAB1);

    type Primitive is record
        Name : prim_func;
        Arg  : string_pointer;
    end record;

    type Pattern;

    type Kinds is (terminal, non_terminal, operation,
                   R, L);

    -- R and L are used to hold the left and right
    -- unmatched substrings

    type Bead(Kind : Kinds) is record
        case Kind is
            when non_terminal => Choice : Pattern;
            when terminal     => Str : string_pointer;
            when operation    => Op : Primitive;
            when R            => null;
            when L            => null;
        end case;
    end record;

    type alt_pointer;
    type Alternates is record
        C    : Bead;
        next : alt_pointer;
    end record;
    type alt_pointer is access Alternates;

    type pats is array(Positive range <>) of Alternates;
    type Pattern is access pats;

Finally a type Sys_Pat was introduced, so that the pattern building functions could
distinguish between internally generated subpatterns and actual user-defined patterns.
For example, the pattern

    p = ("a" + "b" + "c")

should have a length of 3; but the pattern

    p = (A + "c"),

where

    A = ("a" + "b"),

should have a length of 2. By overloading the pattern building functions, the compiler is
forced to choose the proper representation in both cases.
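
The length rule just described can be illustrated with a minimal sketch. It is written in Python rather than Ada, and every name in it (Pat, lit, named, the user_defined flag) is hypothetical: a single flag stands in for the separate Ada types, and operator overloading plays the role of the overloaded pattern building functions. It is not the package described in this paper.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Pat:
    """A pattern held as a sequence of beads (strings or nested patterns)."""
    beads: Tuple[Union[str, "Pat"], ...]
    user_defined: bool = False  # hypothetical flag standing in for the two Ada types

    def named(self) -> "Pat":
        """Mark this pattern as one the user has defined and named."""
        return Pat(self.beads, user_defined=True)

    def __add__(self, other):
        return Pat(_parts(self) + _parts(other))

    def __radd__(self, other):  # allows a plain string on the left of "+"
        return Pat(_parts(other) + _parts(self))

    def __len__(self) -> int:
        return len(self.beads)


def _parts(x) -> tuple:
    """Beads contributed by one operand of "+"."""
    if isinstance(x, str):
        return (x,)
    if x.user_defined:
        return (x,)   # a user-defined pattern stays a single bead
    return x.beads    # an internally generated subpattern is spliced in


def lit(s: str) -> Pat:
    return Pat((s,))


p = lit("a") + "b" + "c"        # built in one expression: three beads
A = (lit("a") + "b").named()    # a user-defined pattern
q = A + "c"                     # A counts as one bead: two beads
```

Here len(p) is 3 and len(q) is 2, matching the two cases in the text.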

The problem of pattern matching is, given a pattern and a target string, to find a
substring such that for each set of alternatives a bead can be found which matches the
substring starting at the point where the last set of alternatives leaves off. In some
schemes the substring may start either flush left, flush right, or anywhere within the target
string, depending on a positional indicator passed to the pattern matching algorithm
along with the pattern and the target string. Since the patterns used here include the
ARB primitive function (which matches any arbitrarily long string of characters), positional
indicators have been left out of this initial work. All patterns are matched flush
left. However, provision has been made to include positional indicators in future versions.

3. SOME ALGORITHMS FOR PATTERN MATCHING 

3.1. BACKTRACKING: RECURSIVE DESCENT

The most intuitive approach to the pattern matching problem is to try every possibility
for each pattern element individually. This leads to the "backtracking" method.
This method starts by trying each bead, for any given set of alternatives, until a match
is found. Then for the next set of alternatives each bead is tried, etc., until all sets of
alternatives have been matched. If for any set of alternatives no bead matches, then the
algorithm backtracks; that is, the previous set of alternatives is tried again, starting
from the bead that just matched. Clearly every possible parse of the string will be
found in this fashion, but there are several problems which arise with this method which
will be discussed later.

A typical way of implementing the backtracking method, and the way that I chose,
is the so-called recursive descent parsing algorithm.† As the name implies, recursion is
used extensively by this method, especially if the bead being matched is itself a pattern.
(Recall that a bead may be either a string, a pattern, or a primitive function.) In this
case the recursive descent algorithm calls itself (i.e. recursion), passing the value of the
bead (i.e. the pattern) and the cursor position of the string. The positional indicator, if
used, would be flush left. In this fashion the algorithm "descends" with recursive calls
until the bead being matched is not a pattern. At this point, if the bead is a string it is
matched against the target. If the bead is a primitive function, then a string is derived
from the function, the target string and the cursor position. This derived string is then
matched against the target. If this matching is successful the next set of alternatives at
this level of descent is tried. (This may be done by recursion also, by iteration, using a
stack, or by coroutines.) After each set of alternatives is matched (or if no match is
found), the algorithm returns to the next higher level with the matched substring (or the
null string if no match is found).

†[See the first paper in this report. -ed.]
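
The descent just described can be sketched compactly. The sketch is in Python rather than Ada and rests on several assumptions: a pattern is a list of sets of alternatives, a bead is a string, a nested pattern, or a function, and a generator reports every cursor position at which a match can end. The names match, match_bead, and arb are hypothetical.

```python
def match(pattern, target, pos=0):
    """Yield every cursor position where `pattern` can end, matched flush left at `pos`.

    `pattern` is a list of sets of alternatives; each set is a list of beads.
    """
    if not pattern:
        yield pos
        return
    first, rest = pattern[0], pattern[1:]
    for bead in first:                              # try each bead of this set in turn
        for mid in match_bead(bead, target, pos):
            # Resuming this loop after a failure further on is the backtracking step.
            yield from match(rest, target, mid)


def match_bead(bead, target, pos):
    if isinstance(bead, str):                       # a string: compare with the target
        if target.startswith(bead, pos):
            yield pos + len(bead)
    elif callable(bead):                            # a primitive function
        yield from bead(target, pos)
    else:                                           # a pattern: descend recursively
        yield from match(bead, target, pos)


def arb(target, pos):
    """ARB-like primitive: matches any run of characters, shortest first."""
    for end in range(pos, len(target) + 1):
        yield end


# Both beads of the first set lead to a complete match of "abc".
ends = list(match([["a", "ab"], ["c", "bc"]], "abc"))
```

Each value yielded corresponds to one parse, so a pattern with many parses is visited once per parse, which is the cost taken up in the analysis of backtracking.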

3.2. COROUTINE IMPLEMENTATION OF BACKTRACKING 

An elegant way of implementing the backtracking aspect of the algorithm (that
is, when no match is found, returning to a previous set of alternatives and resuming
where the algorithm left off) is through the use of coroutines, which in Ada are tasks.†
The task starts by examining each bead in the first set of alternatives. For each bead
that successfully matches, a new task is started, which examines the remainder of the
string and the remaining sets of alternatives. When the last set of alternatives has been
examined, the task passes back the matching substrings (or the null string if no match
has occurred) and terminates. The parent task then adds its substring to the beginning
of each tree on the list which has been passed to it. This new list of trees is then passed
back, and the task terminates, etc., until the topmost task completes all possible parse
trees. Thus although backtracking takes place (each possibility is considered individually)
it occurs with a degree of concurrency dependent on the run-time environment.

Unfortunately, once again the available compiler does not have tasking as one of its
features. The process described above can be implemented as a function, but with the
loss of concurrency and elegance. Furthermore, as shown in the analysis below, backtracking
can be a very costly way of conducting pattern matching. Some of this cost
can potentially be absorbed by concurrency, where the system allows, but the implementation
and run-time analysis of this must await a validated Ada (so that concurrent
tasks can be incorporated into the algorithm). The run-time analysis could then consider
both time and resource utilization. As multiprocessors appear this analysis could
provide some interesting insights into time consumption versus resource demands.

†[See §2 of the first paper in this report. -ed.]

Two noteworthy problems exist with the pattern matching method outlined above.
The first occurs if the pattern itself is left recursive; that is, it has the form P = P'
+ A, where P' is a pattern which can produce P, and A is any pattern (possibly null).
The recursive descent algorithm will examine P by first considering its first set of
alternatives, i.e. P'. This will cause a recursive call, so that P' is considered. Since P'
can produce P, eventually the algorithm will recursively consider P, which then causes a
recursive call to P', eventually leading to another call to P, etc., without ever having
advanced the target string cursor. Thus the recursive descent method goes into an
infinite loop if it encounters a left recursive pattern. Fortunately this is not a major
problem since it has been shown that any pattern can be generated by a pattern in
Greibach Normal Form, which means the pattern has one of the following forms:

    P = a + P1 + P2 + ... + Pn
    P = a
    P = null

where a is any string, and the P's are all patterns. Clearly a pattern in Greibach Normal
Form cannot be left recursive, so if any given pattern is first modified into this
form, the recursive descent algorithm will work.†

The second problem remains even after left recursion has been removed: the time
requirements of the backtracking algorithm in the most general cases of patterns.
Consider as an example the pattern P = ("a" + P + P) or null.
Suppose this pattern is to be matched against a string of a's. Clearly the leftmost
a will be matched by the "a" of P. All other a's can be matched by either the first
(recursive) occurrence of P, or by the second, independent of how any previous or subsequent
a's are matched. Thus if the string is L long, combinatorics tells us that the
number of possible parses is 1 + 2^L; that is, the number of parses is exponential.
Since the backtracking algorithm considers each possible parse individually, it will
require exponential time to parse such a pattern. So, although the backtracking method
may be useful given certain restrictions on the allowable patterns, in the most general
case the time constraints become burdensome.
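
The growth can be checked by brute force for short strings. The sketch below is in Python rather than Ada, and its function names are hypothetical. It counts the complete parses of a string of L a's under P = ("a" + P + P) or null twice: once by visiting each parse individually, as backtracking does, and once by grouping parses that cover the same substring. Counting whole derivation trees in this way gives 1, 1, 2, 5, 14, 42 for L = 0 to 5. That is not the closed form quoted above (the figure depends on what is counted as a distinct parse), but the growth is exponential either way.

```python
from functools import lru_cache


def count_by_backtracking(n):
    """Count parses of 'a' * n under P = ("a" + P + P) or null, one parse at a time."""
    def ends(pos):
        yield pos                       # the null alternative consumes nothing
        if pos < n:                     # "a" + P + P: consume one a, then two P's
            for mid in ends(pos + 1):
                yield from ends(mid)
    return sum(1 for end in ends(0) if end == n)


@lru_cache(maxsize=None)
def count_grouped(n):
    """Same count, but every substring length is worked out only once."""
    if n == 0:
        return 1
    return sum(count_grouped(i) * count_grouped(n - 1 - i) for i in range(n))


counts = [count_by_backtracking(n) for n in range(6)]
```

The first function does work proportional to the number of parses. The second shares the work for equal subproblems, which is the idea that grouping possibilities into classes makes use of.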

3.3. USE OF EARLEY'S ALGORITHM 

Obviously, the way to reduce the time costs of parsing is to not treat each individual
possibility by itself, but rather to group them into classes. In the above example, for
instance, there is no need to consider both of the recursive P's individually since they
both reduce to the same tree. Both P's can be considered in parallel, and then the above
example will parse in linear time! Even with more complex examples, it can be shown
that by developing a scheme to consider similar possibilities in parallel, parsing can be
accomplished in polynomial time, a vast improvement over the exponential time required
by the backtracking method. One such scheme, which can be implemented without any
initial manipulation of the pattern, is known as Earley's algorithm.

Earley's algorithm is described in its mathematical details in [Harrison, 1978] where
a modified (improved) version is called simply "a good practical algorithm". The main
problem I encountered in implementing this algorithm was developing reasonable data
structures to represent the rather complex mathematical formulas introduced: patterns
must be converted to "dotted rules", a triangular matrix of dotted rules must be
created, and the functions "X", "*", and "predict" must be implemented.‡ Once again
the effort was hampered by the lack of generic packaging facilities in the currently
available compiler. The development process certainly could have made good use of a
lower level of abstract data types which included sets and matrices. As it was, these
structures had to be developed concurrently with the rest of the algorithm.

†[It should be mentioned here that it is not common for patterns to call themselves recursively. Recursive
patterns are, however, a possibility that one might not want to exclude, and are very handy in some instances.
In SNOBOL4, they are implemented through the use of "unevaluated expressions", and heuristics
are used to prevent infinite loops of recursive calls (which, in implementation, would tend to cause a stack
overflow). A good discussion of the use of unevaluated expressions in SNOBOL4 can be found in [Griswold,
1975]; the heuristic mentioned is also discussed in [Griswold, 1984]. In Prolog, there are also problems with
left recursion. These can be solved either by automatic transformation of left-recursive clauses or by checking
for the occurrence of particular states (see [Enalls et al., 1984]). -ed.]

‡[The "X" operation is used to implement Earley's "scanner" (see §3.1.3 in the first paper of this report),

Another problem encountered in implementing Harrison's version of Earley's algorithm
is that this version (as most are) is developed for a context-free grammar, not for
pattern matching. Although for the most part context-free parsing is analogous to pattern
matching, the analogy breaks down when the primitive functions are considered.
These primitive functions are in general string and cursor dependent, and so have no
context-free representation. Since they are at most dependent on the string and cursor,
though, it was possible to alter the "predict" function so as to produce simple string
derivations of each primitive function as it is encountered during the parse. This
increases the time requirements as compared to a simple context-free parsing problem,
but the modified algorithm still requires no more than polynomial time.†
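
For orientation, the general shape of an Earley recognizer is sketched below. This is a textbook-style sketch in Python, not Harrison's formulation and not the Ada implementation described here: it keeps a list of item sets rather than a triangular matrix, it only recognizes (it builds no parse tree), and it has no primitive functions. The names earley_recognize and P_GRAMMAR are hypothetical.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples of symbols).
    An item (head, body, dot, origin) is a dotted rule plus the position where it began.
    """
    n = len(tokens)
    chart = [set() for _ in range(n + 1)]
    for body in grammar[start]:
        chart[0].add((start, body, 0, 0))
    for i in range(n + 1):
        changed = True
        while changed:                  # repeat to a fixed point so null rules are handled
            changed = False
            for head, body, dot, origin in list(chart[i]):
                if dot < len(body) and body[dot] in grammar:       # predictor
                    new = {(body[dot], b, 0, i) for b in grammar[body[dot]]}
                elif dot == len(body):                             # completer
                    new = {(h, b, d + 1, o)
                           for h, b, d, o in chart[origin]
                           if d < len(b) and b[d] == head}
                else:
                    new = set()         # a terminal comes next: left to the scanner
                if not new <= chart[i]:
                    chart[i] |= new
                    changed = True
        if i < n:                                                  # scanner
            for head, body, dot, origin in chart[i]:
                if dot < len(body) and body[dot] == tokens[i]:
                    chart[i + 1].add((head, body, dot + 1, origin))
    return any(h == start and d == len(b) and o == 0 for h, b, d, o in chart[n])


# The pattern P = ("a" + P + P) or null, written as a grammar.
P_GRAMMAR = {"P": [("a", "P", "P"), ()]}
accepted = earley_recognize(P_GRAMMAR, "P", list("aaaa"))
```

Because identical items are stored only once in each set, the many parses of a string of a's share their work, and a left recursive grammar causes no infinite loop.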

As can readily be seen from the above discussion, although Earley's algorithm is
faster than the backtracking method, the price is paid in the complexity of the algorithm
and the space it takes while running. The complexity also may make it more difficult
to develop Earley's algorithm to take advantage of a concurrent environment. Now
that the algorithms have been developed into working programs, it remains to be studied
whether the difficulties of Earley's algorithm outweigh its benefits.

4. CONCLUSIONS

Two working programs in Ada have been produced, utilising two different methods
for pattern matching, but clearly, more work remains to be done. To provide maximum
utility to future users and researchers, the programs developed should be rewritten in a
fully validated Ada, making use of the packaging facilities as detailed in the DoD
specifications for the language. In the specific case of the backtracking algorithm for
pattern matching, the rewrite should also include the use of tasks. In this fuller Ada
environment, the two methods of pattern matching could better be tested against each
other in real time, to provide a comparison of their relative merits in time, space and
concurrency.

Two other suggestions regarding Ada have arisen from these efforts. First, Ada
makes no provision for treating functions as data types. Such a treatment is especially
useful in pattern matching, where it is desirable to associate an action to be taken with
a pattern to be matched, as in Post-X. Second, when producing large systems, as is often
the case in AI programming, it would be beneficial to be able to declare subprograms
within a package to be external, so as to be able to compile them separately from the
rest of the package. Although the separate compilation of the packages themselves is
very useful, in complex systems the package itself may grow to a cumbersome point,
with each update requiring inordinate amounts of compile time. Facilities for external
compilation help to relieve this load.

In this paper, we have discussed methods of implementing within Ada the central
facilities for pattern-directed programming with character strings (which could be
extended to other datatypes as suggested in [Wauchope, 1984]). Work is continuing in
comparing these methods and testing their efficiency.

and both the "X" and "*" operations are used to implement Earley's "completer" in Harrison's version of
the algorithm. -ed.]

†[See §3.3 of the first paper in this report for a discussion of the modifications needed. -ed.]
A pattern-directed list processing facility for the Ada programming language is presented.
Pattern lists for matching against source lists are constructed from a set of SNOBOL4-derived
primitives which have been extended to be applicable to arbitrarily complex LISP-like data structures.
Patterns may also contain user-defined symbols, which can serve as nonterminal symbols of
a context-free grammar. Basic list creation and manipulation are made available to the programmer
via a package of LISP-like functions and data types. Several examples of possible applications
in Artificial Intelligence are explored, focusing on computational linguistics problems such as
transformational grammar and parsing, and demonstrating the construction of patterns and the use of
various operations available for testing and manipulating the values which the matcher returns.

1. INTRODUCTION 

The Ada language, with its goal of being the exclusive high level programming system
used in the Department of Defense, includes data abstraction facilities that in effect
make it possible to extend the language by creating new data types and defining the
operators that are to act upon them. For specialized areas of application, a programmer
can invoke the appropriate data abstraction (package) and proceed to write code using
the new high-level constructs it provides, just as if using a new language specifically
designed for that problem domain. One application area of potential interest to Ada
users is artificial intelligence, including the field of computational linguistics, which offers
such possibilities as natural language interfaces with computers, text understanding
and/or information retrieval, natural language programming, and machine translation.
Programming tasks in this category are usually undertaken using specialized string- or
list-processing languages such as SNOBOL4 or LISP, and extending Ada's ability to process
data structures of this sort would greatly facilitate the development of language
processing and other AI-related systems in that programming environment.

Pattern matching is a computational paradigm that is particularly appropriate to
language processing applications, as a language can generally be described in terms of a
series of syntactic patterns and subsequent pattern-directed semantic mappings called a
grammar. If an input sequence of terminal symbols successfully matches the grammar's
set of patterns (rewrite rules), then it is a legitimate sentence in the language and
appropriate further actions (creation of a parse tree, or mapping to a deep structure or
meaning representation) can be carried out. Pattern matching can also be used to drive
the process in the opposite direction, such as matching certain deep structure kernels
and then performing appropriate grammatical transformations on them to yield new surface
structure sentences. Many other AI applications also employ production rules that
are fired by a successful matching of symbolic state conditions, and so pattern matching
can be used to perform such tasks as formula unification, symbolic differential and
integral calculus, and similar problems involving sequences of symbols that are to be
analysed for content and structure.

*Current address: Navy Center for Applied Research in Artificial Intelligence, Code 7510, Naval Research
Laboratory, Washington, D.C. 20375.






Wauchope: List Processing in Ada

The objective of this research was to design and implement a pattern-directed list
processing package in Ada to test the hypothesis that such a package would provide a
practical and useful facility for artificial intelligence programming. A package of list
data-types and list-manipulation functions was created. List patterns could then be constructed
out of SNOBOL-like pattern matching primitives, and the matcher was to
return a list of short-term variables corresponding to the values matched (perhaps only
partially) by the pattern components. These values should then be capable of being
tested, concatenated, or subjected to further matching. Once the matcher was operational,
it would be tested in various areas of application (concentrating upon computational
linguistics problems) to determine the usefulness of the various matching primitives
and the operations upon the returned values.

2. DESIGN OF THE PACKAGE 



2.1. LIST PROCESSING 






The most widely used AI programming language in the United States is LISP, 
which represents sequences of symbols (atoms) as binary linked lists. The primary list 
manipulators are CAR and CDR, which return the first element and remainder of a list, 
respectively, and CONS, which creates a new list out of a pair of elements (themselves 
either atoms or lists). Various predicates are also available to test the identity of data 
items, and more powerful list manipulation functions can be built up out of these 
simpler ones. 
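
As a reminder of how little machinery these operations need, the following sketch models binary list nodes in Python rather than in Ada or LISP. The class and function names are hypothetical; only CAR, CDR, and CONS, plus two functions built from them, are shown.

```python
class Cons:
    """One list node: a left (CAR) and a right (CDR) pointer."""
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr


NIL = None  # the empty list


def cons(a, d):
    return Cons(a, d)


def car(x):
    return x.car


def cdr(x):
    return x.cdr


def atom(x):
    """Predicate: anything that is not a list node counts as an atom."""
    return not isinstance(x, Cons)


def append(x, y):
    """A higher-level function built from the primitives: join two lists."""
    return y if x is NIL else cons(car(x), append(cdr(x), y))


def member(item, lst):
    """Predicate built from the primitives: is `item` a top-level element?"""
    while lst is not NIL:
        if car(lst) == item:
            return True
        lst = cdr(lst)
    return False


def to_python(lst):
    """Convert a chain of nodes to a nested Python list, for display only."""
    out = []
    while lst is not NIL:
        head = car(lst)
        out.append(head if atom(head) else to_python(head))
        lst = cdr(lst)
    return out


abc = cons("a", cons("b", cons("c", NIL)))
```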

Trees are a natural way of representing the structural composition of sentences in a
language, and binary lists can be made to accommodate these structures quite easily.
For example, a parse tree for "he is in the garden" can be represented by the binary list

    (S(NP(Pro(he))VP(V(is)PP(P(in)NP(Det(the)N(garden)))))),

where the constituents of each phrase marker are to be found as a sublist immediately
following it.

The initial task toward creating a list pattern matcher in Ada was to provide
means for the creation and manipulation of atoms and lists. This was accomplished by
defining "S-Expression" as an abstract data type, with its internal structure (either a list
node having left and right child pointers, or atom node having a name field, value field
and next pointer) hidden from view so that only the LISP functions exported from the
package could be used to operate upon values of the type. S-Expression objects are
created by a function "Quote" which converts Ada strings (representing properly balanced
S-expressions) into linked-list structures; the function bears little resemblance to
the LISP Quote (which suppresses evaluation) since no LISP interpreter is actually
involved, but the name was borrowed because of its analogous function. The most useful
core LISP functions and predicates, as well as several higher-level ones (such as
Member and Append), constitute the remainder of the operations available on the
abstract type.
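
The conversion performed by "Quote" can be sketched as a small reader. The sketch is in Python rather than Ada and makes its own assumptions: the result is a nested Python list instead of linked nodes, atoms are any run of characters other than parentheses and white space, and an unbalanced string raises an error. The function name quote is only borrowed from the text.

```python
import re


def quote(text):
    """Convert a balanced parenthesised string into nested Python lists of atoms."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("unbalanced: missing ')'")
            pos += 1                    # consume the closing parenthesis
            return items
        if tok == ")":
            raise ValueError("unbalanced: unexpected ')'")
        return tok                      # an atom

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing input after the first expression")
    return result


tree = quote("(S(NP(Pro(he))VP(V(is))))")
```

In the result, each phrase marker is followed by the sublist of its constituents, as in the parse tree notation used in this paper.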

2.2. LIST PATTERN MATCHING



Since patterns would be constructed by the user in the same form as the source
lists (i.e. parenthesised strings of symbols), it was decided to convert the patterns themselves
into lists (using Quote) and then perform the matching by stepping through each
list and mapping corresponding elements onto each other. The matching itself is thus a
list-traversal process and so was implemented in Ada in much the same way as a pattern
matcher would be written in LISP itself, using recursive procedures written in the Ada
pseudo-LISP.

To adapt SNOBOL-like string matching primitives to the job of list matching, two
versions of each primitive were defined: the first class matching list components (i.e. the
CARs of each node in a particular sublist, which can be either atoms or lists), and the
second matching individual atoms at arbitrary depths of nesting in the tree. Once the
pattern matcher became operational, it would then be possible to determine what use
might be made of the two classes of primitives in actual applications. The primitives
implemented are listed below.

Class I: List-Component Primitives

LIT(e): Matches if the next list element of the source is equal to the element e (atom or
list). Examples: LIT(hello), LIT( ((hi)there) ).

LEN(n): Matches a series of n list elements (atoms or lists). Example: LEN(5).

BAL: Matches an arbitrary number of list elements.

ANY(s): Matches if the next list element of the source is a member of the sequence of
elements s. Examples: ANY(boy cat dog), ANY(atom1 (list 2) ((list)3) ).

NOTANY(s): Matches if the next list element of the source is not a member of the sequence
of elements s. Examples: NOTANY(bad worse), NOTANY( (real bad)(even(worse)) ).

BREAK(s): Matches all list elements until one is encountered that is a member of the
sequence of elements s. Examples: BREAK(stop), BREAK((go(no)further)).

SPAN(s): Matches list elements until one is encountered that is not a member of the sequence
of elements s. Examples: SPAN(ok good), SPAN(yes (fine)).

Class II: Leaf (Atom) Primitives

LITL(a): Matches if the next atom in the source is the atom "a". Example:
LITL(hello).

LENL(n): Matches the next n atoms in the source. Example: LENL(5).

ARB: Matches an arbitrary number of atoms (possibly none).

ANYL(s): Matches if the next atom in the source is a member of the sequence of atoms
s. Example: ANYL(one two three).

NOTANYL(s): Matches if the next atom in the source is not a member of the sequence
of atoms s. Example: NOTANYL(bad no).

BREAKL(s): Matches all atoms until one is encountered that is a member of the sequence
of atoms s. Example: BREAKL(stop).

SPANL(s): Matches atoms until one is encountered that is not a member of the sequence
of atoms s. Example: SPANL(go fine great).

Additional Operators

REM: Matches the entire remainder of the list (possibly empty).

ALT: Attempts to match the first pattern in its argument list followed by the remainder
of the original pattern. If the match fails, it tries the next argument, and so on. Example:
ALT( (SPAN(a)) (SPAN(a b)) ).
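
The behaviour of several Class I primitives can be sketched on ordinary Python lists. This is an illustration in Python rather than Ada, and the details are assumptions, not the package's definitions: each primitive is a function from a source list and a position to the position after the match (or None on failure), SPAN is taken to need at least one element and BREAK to fail when no stopping element is found (both as in SNOBOL4), and the backtracking operators BAL and ALT are left out.

```python
def lit(e):
    return lambda src, i: i + 1 if i < len(src) and src[i] == e else None


def length(n):  # LEN(n)
    return lambda src, i: i + n if i + n <= len(src) else None


def any_of(s):  # ANY(s)
    return lambda src, i: i + 1 if i < len(src) and src[i] in s else None


def notany(s):
    return lambda src, i: i + 1 if i < len(src) and src[i] not in s else None


def span(s):
    def m(src, i):
        j = i
        while j < len(src) and src[j] in s:
            j += 1
        return j if j > i else None     # assumption: at least one element
    return m


def brk(s):     # BREAK(s)
    def m(src, i):
        j = i
        while j < len(src) and src[j] not in s:
            j += 1
        return j if j < len(src) else None  # assumption: a stopping element must exist
    return m


def rem():
    return lambda src, i: len(src)


def match_seq(prims, src):
    """Apply the primitives left to right; return the matched pieces, or None."""
    i, pieces = 0, []
    for p in prims:
        j = p(src, i)
        if j is None:
            return None
        pieces.append(src[i:j])
        i = j
    return pieces


source = ["go", "go", "stop", ["x", "y"], "end"]
pieces = match_seq([span(["go"]), lit("stop"), any_of([["x", "y"], "z"]), rem()], source)
```

Elements are compared with ordinary equality, so a sublist such as (x y) can be matched as a single list component, as Class I requires.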
When matching a Class I primitive in a pattern, the matcher steps down the spine
of the corresponding source sublist and matches the elements found hanging off of it. If
the pattern branches off to the left, the matcher recurses on the CAR of each list and
then returns (if successful) to proceed down the spine once again, also recursively. When
matching against a Class II primitive, however, the matcher begins a depth-first tree
traversal in search of the atoms to match against the primitive. Here the problem arose
of how to allow such a search through an arbitrary tree structure while still retaining the
recursive nature of the matcher; a leaf-matching might leave an "orphan" subtree left
over with no way to bridge back up to higher unmatched levels of the tree for further
matching. The decision was thus made to continually reform the unmatched portion of
the source tree back into a single well-formed tree of comparable structure when doing
leaf-matching, making further recursion always possible by matching the entire remaining
tree at each step. (The backtracking operators BAL, ALT and ARB retain the value
of the original tree to return to if necessary.) In essence, then, the source tree is pruned
of each successful match and any resulting empty list nodes are condensed out. It is
thus possible to freely combine primitives from the two classes in a single pattern,
although use of a primitive from Class I must always accurately reflect the structure of
the remaining source which it is to match. For example, the source

    (a (b (c d) e) f)

will be successfully matched by both patterns

    (SPANL(a b c) LITL(d) LITL(e) LITL(f) )

and

    (SPANL(a b c) (( LIT(d) ) LIT(e) ) LIT(f) ),

where the bracketing in the latter pattern is needed to specify the depth at which each
literal must occur.
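
The pruning step can be made concrete for SPANL. The sketch below is in Python rather than Ada and is only one reading of the description: the source is a nested Python list, the matched atoms are returned as a flat list (an assumption), and the function name spanl is hypothetical. It removes the leading leaves that belong to the given set, stops at the first leaf that does not, and drops any list node left empty.

```python
def spanl(tree, allowed):
    """Prune the leading leaves found in `allowed` from a nested list.

    Returns (matched_atoms, remaining_tree); empty list nodes are condensed out.
    """
    matched = []

    def prune(node):
        out = []
        for k, child in enumerate(node):
            if isinstance(child, list):
                rest, stopped = prune(child)
                if rest:                    # keep only non-empty remainders
                    out.append(rest)
            else:
                stopped = child not in allowed
                if stopped:
                    out.append(child)       # first leaf that does not match
                else:
                    matched.append(child)
            if stopped:
                out.extend(node[k + 1:])    # everything to the right is untouched
                return out, True
        return out, False

    remaining, _ = prune(tree)
    return matched, remaining


source = ["a", ["b", ["c", "d"], "e"], "f"]
matched, remaining = spanl(source, {"a", "b", "c"})
```

For the source (a (b (c d) e) f) the remaining tree is (((d) e) f), which has the bracketing that the second pattern above spells out with LIT(d), LIT(e), and LIT(f).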

As each match is made, the portion of the source tree
matched is placed into an output list in the position corresponding to the sequential
position of the primitive within the pattern. If a match is not completely successful, the
values of the partial matches are returned. This list corresponds to the immediate variable
assignments made in the course of a SNOBOL4 string matching, and is available for
examination and manipulation until the next pattern matching is undertaken. Both
classes of primitive return the portion of the source tree that was traversed in making
the match, except that the values returned by Class II primitives are pruned of any
superfluous higher level list nodes that may have been traversed in reaching the "fruitful"
branches actually matched.

In addition to these literal-matching primitives and operators, patterns may also
contain user-defined symbols, which are atoms that have had values associated with
them using the "Setq" procedure (like "Quote", a borrowed and somewhat redefined
LISP function). For example, a pattern could be constructed as follows:



    Setq ("DIGIT", "(ANY(0 1 2 3 4 5 6 7 8 9))");

    Setq ("DIGITS", "(ALT( (DIGIT) (DIGIT DIGITS) ))");

    Real_No_Pat: constant string := "(DIGITS LIT(.) DIGITS)";

When a user-defined symbol is encountered during matching, its value is substituted for
the symbol in the pattern, and matching then continues.
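
Substitution of user-defined symbols can be sketched in a few lines. The sketch is in Python rather than Ada and simplifies heavily: the source is a flat list of atoms, a pattern is a Python list whose items are symbol names or tuples tagged "ANY", "LIT", or "ALT", and the names SYMBOLS, setq, and match are hypothetical. The three definitions mirror DIGIT, DIGITS, and the real-number pattern.

```python
SYMBOLS = {}


def setq(name, pattern):
    """Associate a pattern (a list of items) with a symbol name."""
    SYMBOLS[name] = pattern


def match(pattern, src, i=0):
    """Yield each position at which `pattern` can end when matched from position i."""
    if not pattern:
        yield i
        return
    head, rest = pattern[0], pattern[1:]
    if isinstance(head, str):                       # user-defined symbol: substitute
        yield from match(SYMBOLS[head] + rest, src, i)
    elif head[0] == "ALT":                          # each alternative, then the remainder
        for alternative in head[1:]:
            yield from match(list(alternative) + rest, src, i)
    elif head[0] == "ANY":
        if i < len(src) and src[i] in head[1]:
            yield from match(rest, src, i + 1)
    elif head[0] == "LIT":
        if i < len(src) and src[i] == head[1]:
            yield from match(rest, src, i + 1)


setq("DIGIT", [("ANY", "0123456789")])
setq("DIGITS", [("ALT", ["DIGIT"], ["DIGIT", "DIGITS"])])
REAL_NO_PAT = ["DIGITS", ("LIT", "."), "DIGITS"]

ends = list(match(REAL_NO_PAT, list("3.14")))
```

The recursive definition of DIGITS terminates because DIGIT must consume an atom before DIGITS is expanded again. A left recursive definition would loop in this sketch.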

ACTIONS

After a pattern match has been performed, the values returned by the matcher are
available for subsequent actions such as testing, concatenation into new lists, or further
matching. They are accessed from the result list by use of an infix operator "/" (e.g., T/n),
which selects the nth member of the list and returns it. Such LISP predicates as Equal,
Member, and Nullp are available for testing of these values, and actions taken can be
conditional upon their results. Catenation of values is possible using standard LISP
Cons or Append, but more convenient is the Ada string concatenator "&", which has
been overloaded to append lists as an infix operator (string values can also be &'ed with
list values if they represent properly balanced lists). The function List is also useful in
correctly structuring the output desired, by forming its argument into a sublist.
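The action operators can be mimicked in Python by operator overloading. The sketch below is only an analogy to the Ada facilities described: the class `SExpr` and the function `List` are hypothetical stand-ins, and it is assumed that selecting an atom yields a one-element list so that the result can be appended with "&".

```python
class SExpr:
    """Thin wrapper around a Python list, standing in for the S_Expr type."""

    def __init__(self, items):
        self.items = list(items)

    def __truediv__(self, n):
        # R / n selects the nth member (1-based) of the result list.
        item = self.items[n - 1]
        return SExpr(item) if isinstance(item, list) else SExpr([item])

    def __and__(self, other):
        # A & B appends two lists, in the manner of the overloaded Ada "&".
        return SExpr(self.items + other.items)

    def __eq__(self, other):
        return isinstance(other, SExpr) and self.items == other.items

    def __repr__(self):
        return f"SExpr({self.items!r})"

def List(x):
    """Form the argument into a sublist."""
    return SExpr([x.items])

R = SExpr(["a", ["b", "c"], "d"])
rebuilt = R/1 & List(R/2) & R/3   # "/" binds more tightly than "&"
print(rebuilt)
```

Here `rebuilt` has the same structure as `R`: the second member is selected, re-wrapped as a sublist by `List`, and appended between the first and third members.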

An example of pattern matching that illustrates the construction of patterns and
the use of several of these action operators is the following (highly simplified) grammati-
cal transformation:



function Pronoun_Subst (Source: S_Expr) return S_Expr is
   Success: boolean;
   T: S_Expr;

   Pattern: constant string :=

"(LIT<S>[LEN<2>L^ 
begin
   Match (Pattern, Source, Success, T);
   if Success then
      if Equal (T/2, T/6) then
         return T/1 & List (
                   T/2 & T/3 & List (
                      T/4 & T/5 & List (
                         "(Pro(he))" & T/7 )));
      end if;
   end if;
   return Source;
end Pronoun_Subst;



Pronoun_Subst (Source) for the input

(S(NP(N(John))VP(V(said)S(NP(N(John))VP(V(was)Adj(rich))))))
returns the transformation

(S(NP(N(John))VP(V(said)S(Pro(he)VP(V(was)Adj(rich))))))
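The same transformation can be sketched in Python, under the assumption that trees are nested lists in which each label is followed by its sublist. The function `pronoun_subst` is hypothetical and uses Python's sequence unpacking in place of the pattern string, so it illustrates the shape of the rewrite rather than the matcher itself.

```python
def pronoun_subst(source):
    """Replace a repeated embedded subject NP by the pronoun 'he'."""
    try:
        # Mirrors the nesting selected by T/1 .. T/7 in the Ada function.
        s, (np, np_body, vp, (v, v_body, s2, (np2, np2_body, *rest))) = source
    except (TypeError, ValueError):
        return source          # the tree does not have the expected shape
    if (s, np, vp, s2, np2) != ("S", "NP", "VP", "S", "NP"):
        return source
    if np_body != np2_body:    # the two subjects differ: nothing to do
        return source
    return [s, [np, np_body, vp, [v, v_body, s2, ["Pro", ["he"], *rest]]]]

john = ["N", ["John"]]
tree = ["S", ["NP", john, "VP",
              ["V", ["said"], "S",
               ["NP", john, "VP", ["V", ["was"], "Adj", ["rich"]]]]]]
print(pronoun_subst(tree))
```

As in the Ada version, a tree whose two subject noun phrases differ is returned unchanged.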

3. APPLICATION EXAMPLES

3.1. PARSING 

One application of user-defined symbols in patterns is to serve as grammatical
rewrite rules, where the symbol represents a nonterminal and its value represents the
right hand side to be expanded to. When doing parsing, the list returned by the
matcher amounts to a parse tree of the input source, and an additional input parameter
to the matcher can cause nodes of the parse tree to be labeled with the appropriate
nonterminals as well. As currently implemented, non-left-recursive context-free grammars
can be handled by the matcher, and are parsed by recursive descent. The matching
function package contains a procedure "Parse" which takes as inputs a start symbol (the
pattern being matched against), source, and label switch, and returns a success/failure
boolean and parse tree, as illustrated below:



Setq("S", "( NP VP)" ); 

Setq ("NP", "( ALT( (N) (Det N) ))" );

Setq ("N", "( ANY(ship plane pilot) )" );

Setq ("Det", "( ANY(a the) )" ); 

Setq ("VP", "( ALT( (V) (V NP) ))" ); 

Setq("V", "( ANY(flew sailed) )" ); 

Start_Symbol: constant string := "S"; 

Source: constant string := "(the pilot flew a plane)"; 

Label: boolean := true; 

Success: boolean; 

Tree: S_Expr; 

Parse (Start_Symbol, Source, Label, Success, Tree);



"True" is returned as the value of Success, and for the list Tree, 

(S(NP(Det(the)N(pilot))VP(V(flew)NP(Det(a)N(plane))))) 

is returned. 

The output of the parser is, clearly, in proper form for further pattern-directed pro- 
cessing such as grammatical transformation, as outlined earlier. 
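A backtracking recursive-descent parse of this kind can be sketched in Python. The names `GRAMMAR`, `expand`, `expand_seq`, `parse` and `show` are hypothetical, the sketch always labels its nodes (there is no label switch), and Python generators are assumed as the backtracking mechanism; it is not a transcription of the Ada procedure.

```python
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["N"], ["Det", "N"]],
    "VP":  [["V"], ["V", "NP"]],
    "N":   {"ship", "plane", "pilot"},   # sets play the role of ANY(...)
    "Det": {"a", "the"},
    "V":   {"flew", "sailed"},
}

def expand(symbol, words, pos):
    """Yield (labeled fragment, next position) for each way symbol matches."""
    rule = GRAMMAR[symbol]
    if isinstance(rule, set):
        if pos < len(words) and words[pos] in rule:
            yield [symbol, [words[pos]]], pos + 1
        return
    for alternative in rule:             # alternatives tried in order, as with ALT
        for children, end in expand_seq(alternative, words, pos):
            yield [symbol, children], end

def expand_seq(symbols, words, pos):
    """Match a sequence of symbols, splicing each label and sublist in turn."""
    if not symbols:
        yield [], pos
        return
    for first, mid in expand(symbols[0], words, pos):
        for rest, end in expand_seq(symbols[1:], words, mid):
            yield first + rest, end

def parse(start, source):
    words = source.strip("()").split()
    for tree, end in expand(start, words, 0):
        if end == len(words):            # accept only a parse of the whole source
            return True, tree
    return False, None

def show(tree):
    """Print a nested list in the report's notation, e.g. (NP(Det(the)N(pilot)))."""
    out, prev_atom = "", False
    for item in tree:
        if isinstance(item, list):
            out, prev_atom = out + show(item), False
        else:
            out, prev_atom = out + (" " if prev_atom else "") + item, True
    return "(" + out + ")"

success, tree = parse("S", "(the pilot flew a plane)")
print(success, show(tree))
```

The first complete parse is accepted; shorter parses (such as the one in which VP matches only "flew") are discarded by backtracking because they do not consume the whole source.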

In order to return a parse tree, a matcher must retain the portion of the result that
was matched by each non-terminal symbol so as to make it a (possibly labeled) sublist of
the tree. One approach to enable this would be to do a "partial match" of the symbol's
right hand side against the source, and if successful then match the remainder of the
pattern against the remainder of the source and append the results. This approach,
however, makes it difficult to backtrack to another alternative (such as ALT, BAL or
ARB) in the right hand side if the matching on the remainder fails. In the present work,
the matcher avoids this problem by always doing a "complete" match (with backtracking)
on the entire pattern, and remembers where each subpattern is to end by the insertion
of an end-of-phrase marker into the pattern after each right-hand-side substitution.
When it encounters one of these markers during the subsequent matching, it knows to
lump the previous results at that level into a separate list, which is stored in a
level-indexed array of subresults that are eventually assembled into the final parse tree.
When not doing parsing, however, the matcher instead lumps together the values that
were matched by each primitive in the pattern, so that these values can be separately
accessed from the result list after matching is completed.
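The end-of-phrase marker technique can be sketched as follows. This is an illustrative Python reconstruction under stated assumptions, not the Ada implementation: the marker is represented as a tuple `("END", symbol)`, the level-indexed array of subresults is modeled as an immutable tuple `levels` (so that backtracking needs no explicit undo), and lexical categories are labeled directly without markers.

```python
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["N"], ["Det", "N"]],
    "VP":  [["V"], ["V", "NP"]],
    "N":   {"ship", "plane", "pilot"},
    "Det": {"a", "the"},
    "V":   {"flew", "sailed"},
}

def match(pattern, words, pos, levels):
    """Match the whole flat pattern; levels[-1] holds results at the current depth."""
    if not pattern:
        if pos == len(words):
            yield levels[0]
        return
    head, rest = pattern[0], pattern[1:]
    if isinstance(head, tuple):
        # End-of-phrase marker: lump the results gathered at this level into
        # one labeled sublist and hand it to the level below.
        lumped = levels[-2] + (head[1], levels[-1])
        yield from match(rest, words, pos, levels[:-2] + (lumped,))
    elif isinstance(GRAMMAR[head], set):
        if pos < len(words) and words[pos] in GRAMMAR[head]:
            current = levels[-1] + (head, (words[pos],))
            yield from match(rest, words, pos + 1, levels[:-1] + (current,))
    else:
        for alternative in GRAMMAR[head]:
            # Substitute the right-hand side, followed by its marker, and keep
            # matching the entire remaining pattern (a "complete" match).
            expanded = alternative + [("END", head)] + rest
            yield from match(expanded, words, pos, levels + ((),))

def to_list(t):
    return [to_list(x) if isinstance(x, tuple) else x for x in t]

def parse(start, words):
    for result in match([start], words, 0, ((),)):
        return True, to_list(result)
    return False, None

print(parse("S", "the pilot flew a plane".split()))
```

Because the remainder of the pattern is always matched inside the same call chain, a failure later in the sentence simply resumes the loop over alternatives of an earlier nonterminal, which is the difficulty the "partial match" approach runs into.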
3.2. SYMBOLIC DIFFERENTIATION

Another application program written employing pattern matching was a procedure
to perform symbolic differentiation of arithmetic expressions. The expression is first
parsed using a grammar of operator precedence, such as

<S> -> <T> + <S> | <T> - <S> | <T>
<T> -> + <U> | - <U> | <U>
<U> -> <V> * <U> | <V> / <U> | <V>
<V> -> <W> ** <V> | <W>
<W> -> t | ( <S> )

and if successful an unlabeled parse tree is returned. For example, parsing

(1 * 2 ** - 3 + 4)

returns the tree

((1 * (2 ** (- 3))) + 4),

indicating the correct order of operator application. This tree is then subjected to a
series of pattern matches and subsequent actions to generate the derivative, as abbrevi-
ated below:



function Deriv (Tree: S_Expr) return S_Expr is
begin
   Match ("(LEN(1) ANY(+ - * / **) LEN(1))", Tree, Success, R);
   if Success then
      if Eq (R/2, "+") or Eq (R/2, "-") then
         -- D(x +- y) = D(x) +- D(y)
         return Deriv (R/1) & R/2 & Deriv (R/3);
      elsif Eq (R/2, "*") then
         -- D(x * y) = D(x) * y + x * D(y)
         return List (Deriv (R/1) & "*" & R/3) & "+" &
                List (R/1 & "*" & Deriv (R/3));
      else ... etc.
   else
      Match ("(ANY(+ -) LEN(1))", Tree, Success, R); ... etc.
end Deriv;




Deriv("(X ** 2 + 5)"), for example, returns the tree

((2 * (X ** (2 - 1) * 1)) + 0),

to which additional pattern-directed processing might then be applied to reduce the tree
to (2 * X). Note that a simple rearrangement of the LIT and ANY primitives in the pat-
terns would process a parse in prefix or postfix form.
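The case analysis in Deriv can be sketched in Python over nested lists. The function `deriv` is hypothetical; it dispatches on the shape of the tree directly rather than through the matcher, differentiates with respect to a variable assumed to be named X, and its rule for "**" assumes a constant exponent.

```python
def deriv(tree, var="X"):
    """Differentiate an infix expression tree with respect to `var`."""
    if not isinstance(tree, list):
        return "1" if tree == var else "0"       # D(x) = 1, D(constant) = 0
    if len(tree) == 2:                           # unary + or -
        op, x = tree
        return [op, deriv(x, var)]
    x, op, y = tree                              # binary: (operand operator operand)
    if op in ("+", "-"):
        # D(x +- y) = D(x) +- D(y)
        return [deriv(x, var), op, deriv(y, var)]
    if op == "*":
        # D(x * y) = D(x) * y + x * D(y)
        return [[deriv(x, var), "*", y], "+", [x, "*", deriv(y, var)]]
    if op == "**":
        # D(x ** n) = n * x ** (n - 1) * D(x), assuming n is a constant
        return [y, "*", [x, "**", [y, "-", "1"], "*", deriv(x, var)]]
    raise ValueError(f"unsupported operator: {op!r}")

print(deriv([["X", "**", "2"], "+", "5"]))
```

No simplification is attempted, so the result for X ** 2 + 5 still carries the factors "* 1" and "+ 0"; removing them would be a further pass over the tree.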
4. EXTENSIONS AND FUTURE RESEARCH

4.1. ALTERNATIVE ADA IMPLEMENTATIONS

This approach to a pattern matcher for lists was inspired by Post-X [Bailes and
Reeker, 1980], an applicative pattern matching language in which patterns have been
generalized into a structure called a Form, consisting of an alternating series of patterns
and corresponding actions combined into a single data object. In Ada, Form could be
realized as a generic package that would be instantiated with a pattern part and an
action part, hence serving as a template for new data objects of this type. In the version
of Ada that was available for this research, generic packages and procedure-variant
generic functions had not yet been implemented, and so more powerful pattern matching
structures such as Forms could not be created except as ad hoc procedures or clauses.
The applicative approach to programming used in Post-X and espoused by Backus [1978]
and others can also be realized in Ada through the use of generics and abstract types,
and so further work on this project using a more complete Ada compiler would lead
closer to the Post-X design.
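The idea of a Form as a single object pairing patterns with actions can be sketched in Python with closures in place of an Ada generic package. The name `make_form` is hypothetical, patterns are assumed to be functions returning a result or `None`, and first-match semantics is an assumption of this sketch rather than a documented property of Post-X.

```python
def make_form(*clauses):
    """Bundle (pattern, action) pairs into one callable data object."""
    def form(source):
        for pattern, action in clauses:
            result = pattern(source)
            if result is not None:       # first pattern that matches wins
                return action(result)
        return None                      # no clause applied
    return form

def is_sum(tree):
    """Pattern: succeed on a three-element list whose middle atom is '+'."""
    if isinstance(tree, list) and len(tree) == 3 and tree[1] == "+":
        return tree
    return None

commute = make_form(
    (is_sum, lambda r: [r[2], r[1], r[0]]),   # swap the operands of a sum
    (lambda t: [t], lambda r: r[0]),          # otherwise return the source as is
)

print(commute(["a", "+", "b"]))
print(commute("x"))
```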

4.2. PATTERN MATCHING IMPLEMENTATION

Pattern matching itself can be considered a form of parsing: for instance, the pat-
tern (BAL LEN(1)) can be represented by the grammar

S -> BAL LEN1
BAL -> t | t BAL
LEN1 -> t

and matching a list against the pattern is equivalent to returning a parse of the list in
terms of the grammar. In this regard, a backtracking pattern matcher is equivalent to a
recursive-descent parser, which is limited in the classes of grammar it can accept and
runs in exponential time as well. Kreuter [1984] has implemented a string pattern
matcher in Ada using Earley's parsing algorithm, which has a worst case time behavior
of N cubed and a more powerful grammar handling capability. Earley's algorithm could
certainly be applied to this list pattern matcher as well and thus provide substantial
improvements. Alternatively, heuristic methods such as SNOBOL's Quickscan mode
could be added to the backtracking design to prune the search space and afford speed-
ups.
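For comparison with the backtracking approach, a compact Earley recognizer for the grammar above can be sketched in Python. This is an illustration only and not Kreuter's Ada implementation: the function name and item layout are hypothetical, every list element is represented by the terminal t, and the sketch assumes a grammar without empty right-hand sides, which lets it omit the special handling that nullable rules require.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if tokens can be derived from start (no empty rules assumed)."""
    # An item is (lhs, rhs, dot, origin); chart[i] holds items ending at i.
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))
    for i in range(len(tokens) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in grammar:                            # predictor
                    new_items = [(nxt, r, 0, i) for r in grammar[nxt]]
                    target = i
                elif i < len(tokens) and tokens[i] == nxt:    # scanner
                    new_items = [(lhs, rhs, dot + 1, origin)]
                    target = i + 1
                else:
                    continue
            else:                                             # completer
                new_items = [(l, r, d + 1, o)
                             for (l, r, d, o) in chart[origin]
                             if d < len(r) and r[d] == lhs]
                target = i
            for item in new_items:
                if item not in chart[target]:
                    chart[target].add(item)
                    if target == i:
                        agenda.append(item)
    return any((start, rhs, len(rhs), 0) in chart[-1] for rhs in grammar[start])

GRAMMAR = {
    "S":    [("BAL", "LEN1")],
    "BAL":  [("t",), ("t", "BAL")],
    "LEN1": [("t",)],
}

print(earley_recognize(GRAMMAR, "S", ["t", "t", "t"]))
print(earley_recognize(GRAMMAR, "S", ["t"]))
```

Each chart entry is computed once and shared among all derivations that reach it, which is what replaces the repeated re-matching of a backtracking matcher.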

4.3. SELECTORS

In Post-X, simple pattern matching returns its result in the form of a tree
corresponding in structure to the pattern used; values are then accessed by multiple use
of a selector operator, e.g. R/3/2 would select the second subtree of the result's third
subtree. In the present work, values are instead returned as a linear list. Each approach
might be useful in certain applications, and the current matcher could be easily modified
to allow the user to select which result mode was desired.
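The two result modes can be contrasted with a small Python sketch. The helpers `select` and `linearize` are hypothetical, matched values are represented as opaque strings, and the tree-shaped result is an assumed illustration of a result that follows the nesting of a bracketed pattern.

```python
from functools import reduce

def select(result, *path):
    """select(R, 3, 2) mimics R/3/2: the second subtree of the third subtree."""
    return reduce(lambda node, n: node[n - 1], path, result)

def linearize(result):
    """Flatten a pattern-shaped result into a linear list of matched values."""
    flat = []
    for item in result:
        if isinstance(item, list):
            flat.extend(linearize(item))
        else:
            flat.append(item)
    return flat

# A result shaped like a nested pattern; each string is one matched value.
tree_result = ["S", ["NP(N(John))", "VP",
                     ["V(said)", "S", ["NP(N(John))", "VP(V(was)Adj(rich))"]]]]

print(select(tree_result, 2, 3, 3, 1))   # repeated selection in tree mode
print(linearize(tree_result)[6 - 1])     # the sixth value in linear mode
```

Both expressions reach the same matched value; the tree mode needs a path of selectors, while the linear mode needs only the position of the primitive in the pattern.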

4.4. LEAF MATCHING PRIMITIVES

In writing application programs for this matcher, only small use was made of the
leaf-matching primitives. Further research should determine areas where these operators
might prove more powerful.

4.5. CONCLUSION

The programming of pattern-directed packages for a variety of datatypes within
Ada appears not only feasible, but quite worthwhile. Work is continuing in the areas
described above.






REFERENCES

Aho, A. V., B. W. Kernighan and P. J. Weinberger [1979]. Awk - a pattern scanning
and processing language, Software - Practice and Experience, 9, 267-279.

Backus, J. [1978]. Can programming be liberated from the von Neumann style? A func-
tional style and its algebra of programs, Communications of the Assoc. for Comput.
Machinery, 21, 8, 613-641.

Bailes, P. A. [1983]. The Derivation of An Applicative Programming Language for String
Processing, Ph.D. Thesis, Department of Computer Science, University of Queens-
land.

Bailes, P. A., and L. H. Reeker [1980a]. Post-X: An experiment in language design for
string processing, Australian Computer Science Communications, 3, 2, 252-267.

Bailes, P. A., and L. H. Reeker [1980b]. An experimental applicative programming
language for linguistics and string processing, Proceedings, 8th Intl. Conf. on Com-
putational Linguistics, Tokyo, 520-525.

Bolz, R. [personal communication]. A package of string functions in Ada. 

Clocksin, W. F., and C. S. Mellish [1981]. Programming in Prolog, Springer-Verlag, Ber- 
lin. 

Dallman, Brian [1984]. AFHRL Program for Artificial Intelligence Applications to 
Maintenance and Training, Artificial Intelligence in Maintenance: Proceedings of the 
Joint Services Workshop, TR-84-35, Air Force Human Resources Laboratory, 
Training Systems Division, Lowry AFB, Colorado. 

Earley, Jay [1970]. An efficient context-free parsing algorithm, Communications of the
Assoc. for Comput. Machinery, 13, 94-102.

Ennals, R., J. Briggs and D. Brough [1984]. What the naive user wants from Prolog,
Implementations of Prolog, Campbell (ed.), Ellis Horwood, Chichester, England,
376-386.

Farber, D. J., R. E. Griswold and I. P. Polonsky [1964]. SNOBOL, A string manipulation
language, Journal of the Assoc. for Comput. Machinery, 11, 1, 21-30.

Floyd, R. W. [1967]. Nondeterministic algorithms, Journal of the Assoc. for Comput.
Machinery, 14, 4, 636-644.

Galler, B. A. and A. J. Perlis [1970]. A View of Programming Languages, Addison-
Wesley, Reading, Massachusetts.

Graham, S. L., M. A. Harrison and W. L. Ruzzo [1976]. On-line context-free recognition
in less than cubic time, Proceedings of the Eighth Annual ACM Symposium on
Theory of Computing, 112-120.

Greibach, S. A. [1965]. A new normal form for context-free grammars, Journal of the
Assoc. for Comput. Machinery, 12, 1, 42-52.

Griswold, R. E. [1975]. String and List Processing in SNOBOL4: Techniques and Appli-
cations, Prentice-Hall, Englewood Cliffs, New Jersey.

Griswold, R. E. [1984]. The control of searching and backtracking in string pattern
matching, Implementations of Prolog, Campbell (ed.), Ellis Horwood, Chichester,
England, 50-64.

Griswold, R. E., J. F. Poage and I. P. Polonsky [1971]. The SNOBOL4 Programming
Language, Prentice-Hall, Englewood Cliffs, New Jersey.

Harrison, M. A. [1978]. Introduction to Formal Language Theory, Addison-Wesley,
Reading, Massachusetts.

Hewitt, C. E. [1969]. PLANNER: A language for manipulating models and proving
theorems in a robot, Proceedings of the International Joint Conference on AI, 295-
301.
Honeywell, Inc. [1983]. Reference Manual for the Ada Programming Language,
ANSI/MIL-STD-1815A. Produced for the United States Department of Defense by
Honeywell Systems and Research Center, Minneapolis, and Alsys, La Celle Saint
Cloud, France.

Honeywell, Inc. [1984]. Rationale for the Design of the Ada Programming Language. Pro-
duced for the United States Department of Defense by Honeywell Systems and
Research Center, Minneapolis, and Alsys, La Celle Saint Cloud, France.

Kay, M. [1980]. Algorithm schemata and data structures in syntactic processing, Techni-
cal Report, XEROX Palo Alto Research Center, Palo Alto, California.

Kreuter, John [1984]. Pattern-Matching Algorithms in Ada, Final Report, 1984 USAF-
SCEEE Graduate Student Summer Support Program. Edited version included as
the second paper of this report.

Kuno, S. and A. G. Oettinger [1962]. Multiple-path syntactic analyzer, Information Pro-
cessing 62, Popplewell (ed.), North-Holland, Amsterdam, 300-311.

Lewis, H. R. and C. H. Papadimitriou [1981]. Elements of the Theory of Computation, 
Prentice-Hall, Englewood Cliffs, New Jersey. 

Liu, Ken-Chih, and Arthur Fleck [1979]. String pattern matching in polynomial time,
Proceedings of the Sixth ACM Symposium on Principles of Programming Languages,
San Antonio, Texas, 222-225.

Markov, A. A. [1951]. Theory of algorithms, Trudy Matematicheskogo Instituta imeni V.
A. Steklova, 38, 175-180 [in Russian; English Translation, American Math. Society
Trans., 2, 15, 1-14 (1960)].

Pereira, F. C. N., and D. H. D. Warren [1983]. Parsing as deduction, Proceedings of the
21st Annual Meeting of the Association for Computational Linguistics, Cambridge,
Massachusetts.

Post, E. L. [1943]. Formal reductions of the general combinatorial decision problem,
American Journal of Mathematics, 65, 197-215.

Reeker, L. H. and P. A. Bailes [in preparation]. A proposal for a graphic programming
environment for flexible "languageless" programming.

Rieger, C., and S. Small [1979]. Word expert parsing, Proceedings, Sixth Intl. Conf. on
Artificial Intelligence, Tokyo, 1979.

Richardson, J. Jeffrey [1983]. Artificial Intelligence: An Analysis of Potential Applica- 
tions to Training, Performance Measurement and Job Performance Aiding, TP-tt- 
It, Air Force Human Resources Laboratory, Training Systems Division, Lowry 
AFB, Colorado. 

Rogers, Hartley, Jr. [1967]. The Theory of Recursive Functions and Effective
Computability, McGraw-Hill, New York.

Sussman, G. J., T. Winograd and E. Charniak [1971]. MICROPLANNER Reference
Manual, AI Memo 203A, Massachusetts Institute of Technology, Cambridge, Mas-
sachusetts.

Tucci, Ralph [forthcoming]. Analysis and Development. Master's Thesis, Department
of Computer Science, Tulane University.

Valiant, L. G. [1975]. General context-free recognition in less than cubic time, Journal of
Computer and Systems Sciences, 10, 308-315.

Warren, D. H. D., L. M. Pereira and F. Pereira [1977]. Prolog - the language and its
implementation compared with LISP, Proceedings of the ACM Symposium on
Artificial Intelligence and Programming Languages, Rochester, New York, 109-115.

Wauchope, Kenneth [1984]. Pattern-Directed List Processing in Ada, Final Report, 1984
USAF-SCEEE Graduate Student Summer Support Program. Edited version
included as the third paper of this report.

Winston, P. H., and B. K. P. Horn [1981]. LISP, Addison-Wesley, Reading, Mas-
sachusetts.

Yngve, Victor H. [1958]. A programming language for mechanical translation, Mechani-
cal Translation, 5, 1, 26-41.



